Full-Time Data Engineer
Sayari is hiring a remote, full-time Data Engineer. The position is at the Experienced career level and is open to remote applicants worldwide. Read the complete job description before applying.
Job Details
Sayari’s flagship product, Sayari Graph, provides instant access to structured business information from billions of corporate, legal, and trade records. As a member of Sayari’s data team, you will work with the Product and Software Engineering teams to collect data from around the globe, maintain existing data pipelines, and develop new pipelines that power Sayari Graph.
Job Responsibilities:
- Write and deploy crawling scripts to collect source data from the web
- Write and run data transformers in Scala Spark to standardize bulk data sets
- Write and run modules in Python to parse entity references and relationships from source data
- Diagnose and fix bugs reported by internal and external users
- Analyze and report on internal datasets to answer questions and inform feature work
- Work collaboratively within and across engineering teams using agile principles
- Give and receive feedback through code reviews
Skills & Experience:
- Professional experience with Python and a JVM language (e.g., Scala)
- 2+ years of experience designing and maintaining data pipelines
- Experience using Apache Spark and Apache Airflow
- Experience with SQL and NoSQL databases (e.g., column stores, graph databases)
- Experience working on a cloud platform like GCP, AWS, or Azure
- Experience working collaboratively with Git
- Understanding of Docker/Kubernetes
- Interest in learning from and mentoring team members
- Experience supporting and working with cross-functional teams in a dynamic environment
- Passion for open source development and innovative technology
- Experience working with analytics and BI tools like BigQuery and Superset is a plus
- Understanding of knowledge graphs is a plus