Waveum · 3 days ago
Junior Data Science Analyst
Waveum is a fast-growing technology company that builds high-performance, real-time platforms inspired by the principles of modern trading systems. The Junior Data Science Analyst will support Waveum’s healthcare AI initiatives, focusing on applied AI, document intelligence, and human-in-the-loop machine learning, working with real healthcare data and documents.
Blockchain, Enterprise Software, Retail
Responsibilities
Support the development of AI-powered data pipelines that process healthcare data and documents
Work with structured, semi-structured, and unstructured data, including text extracted from healthcare documents
Assist in preparing datasets used for AI-driven classification, matching, and validation workflows
Support OCR-based document processing pipelines for healthcare artifacts such as prescriptions, clinical forms, referrals, and supporting documentation
Help analyze OCR outputs, identify extraction errors, and contribute to normalization and validation logic
Work with text extracted from PDFs and scanned documents to enable downstream AI processing
Apply pragmatic ML and NLP techniques for classification, text similarity, entity resolution, and normalization of healthcare and product data
Use string similarity and fuzzy matching techniques for real-world matching problems (e.g., token-based similarity, edit-distance methods, heuristic matching)
Support evaluation of AI outputs with attention to confidence, edge cases, and failure modes
Contribute to AI workflows composed of multiple steps or agents, where each step performs a specific task (extraction, matching, validation, enrichment)
Help design and refine human-in-the-loop review mechanisms, including confidence thresholds and escalation logic
Assist in testing and validating AI-driven decisions before they are surfaced to users
Support engineering teams with data exploration, validation logic, and analytical insights
Qualifications
Required
1–3 years of experience (or strong academic/internship projects) in data science, applied machine learning, analytics, or AI-focused roles
Strong proficiency in Python, including pandas, numpy, data cleaning and transformation
Working knowledge of SQL
Knowledge of Java programming (Spring) is a big plus
Solid understanding of statistics fundamentals
Exposure to applied AI or ML systems used in production or realistic projects
Familiarity with OCR concepts and document-based data processing (tools, pipelines, or coursework)
Understanding of basic NLP concepts such as tokenization, similarity, and text normalization
Familiarity with string similarity and fuzzy matching techniques for entity resolution and normalization (e.g., token-based similarity, edit-distance methods)
Awareness of human-in-the-loop AI and explainability requirements
Bachelor's or Master's degree in Data Science, Computer Science, Statistics, Engineering, or a related quantitative field
Healthcare-focused or AI-focused coursework/projects are a strong plus
Preferred
Experience with healthcare data
Experience working with text extracted from PDFs or scanned documents
Familiarity with embeddings, semantic similarity, or LLM-assisted workflows
Exposure to agent-based or multi-step AI workflows
Experience in regulated environments (HIPAA, SOC 2, HITRUST)
Company
Waveum
Waveum is the first-ever collaborative network with complete end-to-end encryption.
Funding
Current Stage: Early Stage
Total Funding: unknown
2018-02-18: Non Equity Assistance
Company data provided by Crunchbase