Neptune Bio · 2 months ago
Computational Research Associate
Neptune Bio is a startup focused on functional genomics and single-cell perturbation experiments. The company is seeking a Computational Research Associate to support data analysis and infrastructure efforts, working with large biological datasets and collaborating with scientists to translate data into biological results.
Biotechnology
Responsibilities
Process and analyze large-scale single-cell and perturb-seq datasets using established computational pipelines
Develop, document, and maintain reproducible analysis workflows and data processing infrastructure
Support data management and organization across multiple internal and external datasets
Collaborate closely with experimental and computational scientists to translate raw data into interpretable biological results
Implement and optimize pipelines in cloud environments (e.g., AWS, GCP) for scalable data processing
Maintain codebases, perform quality control on data outputs, and ensure reproducibility and traceability of analyses
Generate clear reports, visualizations, and summaries to communicate results across teams
Qualifications
Required
B.S. or M.S. in Bioinformatics, Computational Biology, Computer Science, or a related quantitative field
2+ years of experience working with biological or single-cell datasets
Proficiency in Python and/or R for data analysis and visualization
Familiarity with standard genomics tools and data formats (FASTQ, BAM, HDF5, AnnData, etc.)
Experience using and maintaining analysis pipelines in a Unix/Linux environment
Experience working with cloud compute platforms (AWS, GCP, or similar)
Strong organizational skills, attention to detail, and commitment to clean, reproducible code
Preferred
Experience analyzing single-cell RNA-seq or perturb-seq datasets
Familiarity with workflow management systems (Nextflow, Snakemake, or similar)
Experience with containerization tools such as Docker
Exposure to data engineering concepts (e.g., databases, versioned data storage, data pipelines)
Understanding of basic statistical methods for genomics data analysis