Full-Time Senior Data Engineer (AI and ML frameworks)

Sigma Software is hiring a remote Full-Time Senior Data Engineer (AI and ML frameworks). The career level for this opening is Senior Manager, and the role accepts remote applicants based in Warsaw, Poland. Read the complete job description before applying.

This job was posted 7 months ago and is likely no longer active.

Sigma Software

Job Title

Senior Data Engineer (AI and ML frameworks)

Posted

Job Type

Full-Time

Career Level

Senior Manager

Locations Accepted

Warsaw, Poland

Job Details

Data Standardization and Transformation: Convert diverse data structures from various EHR systems into a unified format based on FHIR standards. Map and normalize incoming data to the FHIR data model, ensuring consistency and completeness.
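
To illustrate the kind of mapping this involves, here is a minimal Python sketch that converts a flat source record into a FHIR R4 Patient resource. The source field names (`patient_id`, `first_name`, `last_name`, `dob`, `sex`), the MM/DD/YYYY date format, and the identifier system URI are assumptions made for the example; they are not taken from any specific EHR system or from this project.

```python
from datetime import datetime

# Hypothetical source codes mapped to the FHIR administrative-gender value set.
GENDER_MAP = {"m": "male", "f": "female", "o": "other"}


def to_fhir_patient(record: dict) -> dict:
    """Map a hypothetical flat EHR record to a minimal FHIR R4 Patient resource."""
    # Normalize the birth date from the assumed MM/DD/YYYY format to YYYY-MM-DD.
    birth_date = datetime.strptime(record["dob"], "%m/%d/%Y").date().isoformat()
    # Treat a missing or empty sex code as "unknown" rather than failing.
    sex_code = (record.get("sex") or "").strip().lower()
    return {
        "resourceType": "Patient",
        "identifier": [
            # The system URI below is a placeholder, not a real namespace.
            {"system": "urn:example:ehr-a", "value": str(record["patient_id"])}
        ],
        "name": [
            {
                "family": record["last_name"].strip(),
                "given": [record["first_name"].strip()],
            }
        ],
        "gender": GENDER_MAP.get(sex_code, "unknown"),
        "birthDate": birth_date,
    }
```

A real pipeline would need one such mapping per source system and resource type, plus validation against the FHIR profiles in use.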

Kafka Integration: Consume and process events from the Kafka stream produced by the Data Writer Module. Deserialize and validate incoming data to ensure adherence to required standards.
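
As a rough sketch of the deserialize-and-validate step, the Python function below decodes a raw message value and checks it against a small set of required fields. It assumes JSON-encoded messages; the field names in `REQUIRED_FIELDS` and the `InvalidEvent` exception are hypothetical, and the Kafka consumer itself is left out so the example stays independent of any client library.

```python
import json

# Hypothetical minimal contract for an incoming event: field name -> expected type.
REQUIRED_FIELDS = {"event_id": str, "source_system": str, "payload": dict}


class InvalidEvent(ValueError):
    """Raised when a message cannot be decoded or fails validation."""


def deserialize_event(raw: bytes) -> dict:
    """Decode a raw message value as JSON and validate its required fields."""
    try:
        event = json.loads(raw.decode("utf-8"))
    except (UnicodeDecodeError, json.JSONDecodeError) as exc:
        raise InvalidEvent(f"not valid UTF-8 JSON: {exc}") from exc
    if not isinstance(event, dict):
        raise InvalidEvent("top-level value must be a JSON object")
    for field, expected_type in REQUIRED_FIELDS.items():
        if field not in event:
            raise InvalidEvent(f"missing field: {field}")
        if not isinstance(event[field], expected_type):
            raise InvalidEvent(f"field {field} must be {expected_type.__name__}")
    return event
```

In a deployment that uses Avro with a schema registry, as listed under Technical Skills, the registry's deserializer would take the place of the JSON decoding and the hand-written field checks.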

Data Segmentation: Separate data streams for warehousing and AI model training, applying specific preprocessing steps for each purpose. Prepare and validate data for storage and machine learning model training.

Error Handling and Logging: Implement robust error handling mechanisms to track and resolve data mapping issues. Maintain detailed logs for auditing and troubleshooting purposes.

Data Ingestion and Processing: Use LLMs to extract structured data from EHRs, research articles, and clinical notes. Ensure semantic consistency and interoperability during data ingestion.

Knowledge Graph Construction: Integrate extracted data into a knowledge graph, representing entities and relationships for semantic data integration. Implement contextual understanding and querying of complex relationships within the knowledge graph (KG).
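
The idea of representing entities and relationships and then querying across them can be sketched with a tiny in-memory triple store in Python. The class name, the entity identifiers, and the predicate names are all hypothetical; a production system would more likely use a graph database such as Neo4j or an RDF store, as mentioned under Technical Skills.

```python
from collections import defaultdict


class TinyKnowledgeGraph:
    """A toy in-memory store of (subject, predicate, object) triples."""

    def __init__(self):
        # subject -> set of (predicate, object) pairs
        self._out = defaultdict(set)

    def add(self, subject: str, predicate: str, obj: str) -> None:
        """Record one relationship between two entities."""
        self._out[subject].add((predicate, obj))

    def objects(self, subject: str, predicate: str) -> set:
        """Return every object linked to `subject` by `predicate`."""
        return {o for p, o in self._out.get(subject, ()) if p == predicate}

    def two_hop(self, subject: str, first: str, second: str) -> set:
        """Follow `first` from `subject`, then `second` from each result."""
        return {
            o2
            for o1 in self.objects(subject, first)
            for o2 in self.objects(o1, second)
        }


kg = TinyKnowledgeGraph()
kg.add("patient:1", "has_condition", "condition:diabetes")
kg.add("condition:diabetes", "treated_by", "drug:metformin")
treatments = kg.two_hop("patient:1", "has_condition", "treated_by")
```

Here `treatments` holds the drugs linked to the patient through their conditions, which is the kind of multi-step relationship query a knowledge graph is meant to support.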

Advanced Predictive Modeling: Leverage KGs and LLMs to enhance data interoperability and predictive analytics. Develop frameworks for contextualized insights and personalized medicine recommendations.

Feedback Loop: Continuously update the knowledge graph with new data using LLMs, ensuring up-to-date and relevant insights.

Work Closely with Cross-Functional Teams: Collaborate with data scientists, AI specialists, and software engineers to design and implement data processing solutions. Communicate effectively with stakeholders to align on goals and deliverables.

Contribute to Engineering Culture: Foster a culture of innovation, collaboration, and continuous improvement within the engineering team.

Technical Skills:

  1. Deep understanding of patterns and software development practices for event-driven architectures
  2. Hands-on experience with stateful stream data processing solutions (Kafka or similar streaming platforms)
  3. Strong knowledge of data serialization/deserialization using various data formats (at minimum JSON and Avro), and integration with schema registries
  4. Proven Python software development expertise, with experience in data processing and integration
  5. Practical experience building end-to-end solutions with Apache Flink or a similar platform
  6. Experience with containerization and orchestration using Kubernetes (K8s) and Helm, especially on Google Kubernetes Engine (GKE)
  7. Familiarity with Google Cloud Platform (GCP) or a similar cloud platform
  8. Hands-on experience implementing data quality solutions for schema-on-read or schema-less data
  9. Hands-on experience integrating with Apache Kafka, particularly the Confluent Platform
  10. Familiarity with AI and ML frameworks
  11. Proficiency in SQL and experience with both relational and NoSQL databases
  12. Experience with graph databases like Neo4j or RDF-based systems
  13. Experience in the healthcare domain and familiarity with healthcare standards such as FHIR and HL7 for data interoperability

Would Be a Plus: Experience with web data scraping.

Personal Profile:

  • Strong problem-solving skills, with the ability to design innovative solutions for complex data integration and processing challenges
  • Excellent communication skills, with the ability to articulate complex technical concepts and work effectively with various stakeholders
  • Commitment to improving healthcare through data-driven solutions and technology
  • Commitment to staying abreast of the latest technologies and industry trends while continually improving your skills and knowledge
  • Ability to work in a collaborative environment, valuing diverse perspectives and contributing to a positive team culture

FAQs

What is the last date for applying to the job?

The deadline to apply for Full-Time Senior Data Engineer (AI and ML frameworks) at Sigma Software is 25th of May 2025. We consider jobs older than one month to have expired.

Which countries are accepted for this remote job?

This job accepts applicants based in Warsaw, Poland.
