Senior Data Engineer

Come join DSV as a Senior Data Engineer with a focus on data sourcing, real-time event streaming data flows and many other exciting tasks.

We are a team with some of the most knowledgeable colleagues within IT development. Here we share knowledge and new ideas for great solutions, such as how to fine-tune partitioning, replicas and offsets to improve performance, or how to build scalable and robust data flows that can process large volumes of data in near real-time. You would also be working with:

  • Sourcing data from systems in DSV via the appropriate pattern for the use case (e.g. REST services, event streaming) or, at the beginning of projects before the real integration is established, via interim means (e.g. simple file dumps on FTP servers)
  • Building near real-time event streaming data flows with separated micro services for different functional purposes (e.g. file upload, image extraction, file extraction, etc.)
  • Reconfiguring the event streaming broker for optimization
  • Establishing domain-based ontologies for data to ensure semantic interoperability between different systems
  • Making data available in the form needed for ML models and for visualization by business applications with a frontend, and collecting the users' inputs from the frontend as data enrichment
  • Working together with the logging team, who set guidelines for how to structure your log outputs
  • Analyzing statistics of data flows to identify bottlenecks, and removing these to improve the data flow
  • Logging key aspects of data flows to ensure observability and lineage
  • Using well-structured naming conventions

The use cases

The focus of our team is to build advanced end-to-end solutions that create direct business value for DSV’s divisions, including, for example:

  • Customs declaration automation
  • Vendor invoices automation
  • Address validation
  • ETA prediction
  • And many more to come…

The word “advanced” is used to underline that the use cases we solve tend to have a high degree of complexity, requiring non-deterministic problem solving (i.e. the use of ML/AI), near real-time data processing, high availability, vertical and horizontal scalability and support for a very high volume of transactions. However, fancy technologies and accurate ML models alone do not solve the issues at hand; we strive to combine our competencies to build holistic solutions where the underlying complexity is hidden from the user, creating simple and value-adding experiences.

Experienced Data Engineer 

You always think about tomorrow’s requirements when designing a data flow, and you take a holistic view by understanding the business context, helping the data scientists get the best data for their models and the application developers get the data needed for the UI. We have a lot of responsibility, both for exciting R&D work that pushes the boundaries of data processing and for the necessary nitty-gritty work in the data flow. We would therefore like you to have experience with the following:

  • Automated testing and evaluation of the performance of your data pipeline
  • Breaking down solutions into iterations so they can deliver value quickly in MVP versions before they are enriched with more nice-to-have functionality in later iterations
  • Making realistic mockups of data so you can test things swiftly on synthetic data before you get access to production data
  • Ensuring semantic interoperability between different systems by standardizing data definitions

We expect you to have experience with most of the following technologies:

  • Event streaming: Confluent Kafka, Kafka Streams, KSQL
  • Logging & Monitoring: Elastic, Kibana, Grafana, Logstash, Fluentd
  • Coding languages: Python, Java, Scala
  • Storage technologies such as:
    • SQL database: We use MySQL
    • NoSQL database: We use MongoDB
    • File systems: E.g. Azure Files, Filestore, etc.
    • Blob storage: E.g. Azure Blob Storage, S3, etc.
  • Version control: Git (we use Atlassian Bitbucket as a wrapper on top of Git)
  • Containerization: Docker/containerd
  • Container orchestration: Kubernetes
  • OS: Linux (CentOS/RHCOS) and Windows
  • BI tools: Power BI, Qlik, etc.
  • Moderate experience with cloud platforms: AWS, Azure, T-platforms, GCP

It is a bonus if you also have some experience with some of the other technologies that our team works with, such as:

  • Other data processing frameworks: E.g. Spark or Ray
  • ML model serving: TensorFlow Serving, TorchServe
  • Authentication: OpenID Connect, built on OAuth 2.0 (we use Red Hat Keycloak as identity broker)
  • CI/CD Pipelines: Jenkins (our templates are written in Groovy) and Azure DevOps
  • Load balancing: NGINX
  • Installation scripts: Ansible
  • Requirements: Jira
  • Documentation: Confluence
  • Frontend technologies: React JS, Material UI, JavaScript/TypeScript, Redux 
  • Test framework: Jest
  • ML Frameworks: TensorFlow / PyTorch

Our team

We are an ambitious team with a flat hierarchy and a mix of young and very experienced people who work according to the following principles:

  • We celebrate victories together
  • We take responsibility for mistakes and learn from them 
  • We design for scale but build only for the near future
  • We value working software and informal alignment over tedious documentation 
  • For high-risk areas of the with complex business value 
  • We make decisions based on knowledge and insight rather than hierarchical structures
  • Everyone can speak their honest opinion

We have all the needed competencies to build awesome products inside the team:

  • Product owner
  • Business analysts
  • Application developers (frontend + backend)
  • Data engineers, Data scientists
  • ML engineers
  • DevOps engineers

Want to know more and apply? 

We will be happy to answer any questions you may have about the position and your options at DSV. You are welcome to call Lead Data Engineer Sergey Boldyrev at +358 504873843.

We look forward to receiving your application via the link below as soon as possible. We will process the applications as we receive them.

DSV – Global Transport and Logistics

DSV is one of the very best performing companies in the transport and logistics industry. 75,000 employees in more than 90 countries work passionately to deliver great customer experiences and high-quality services – as part of the operation or in a variety of supporting roles. If you have drive and talent and enjoy responsibility, we’ll give you the support you need to explore your potential and forward your career.

Read more at www.dsv.com
