Using POS tagging to extract genetic information from patient data

This project aimed to develop a knowledge graph system that could extract genetic information from unstructured text files. As team leader, I was primarily responsible for project planning and research. I also tracked my team members' progress and reviewed their project designs.

The team developed a data pipeline to extract gene names, sequences, and organism names from patients' data. We integrated the ETL pipeline with the BLAST+ database to classify the extracted genetic information by taxonomic category. This allowed us to consolidate the extracted knowledge with other information and increase its business value.
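
To make the extraction step more concrete, here is a minimal sketch of how gene names, sequences, and organism names might be pulled out of free text. It is not the team's actual pipeline: simple regular expressions stand in for the POS tagger, the three patterns and the `extract_entities` function are hypothetical, and the BLAST+ classification step is left out.

```python
import re

# Hypothetical patterns chosen for illustration; a real pipeline would rely
# on a POS tagger and curated vocabularies instead of these heuristics.
SEQUENCE_RE = re.compile(r"\b[ACGT]{10,}\b")            # 10+ nucleotide letters
GENE_RE = re.compile(r"\b[A-Z][A-Z0-9]{2,7}\b")         # short symbols such as BRCA1
ORGANISM_RE = re.compile(r"\b[A-Z][a-z]+ [a-z]{3,}\b")  # binomials such as Homo sapiens


def extract_entities(text):
    """Return candidate gene names, sequences, and organism names found in text."""
    return {
        "genes": GENE_RE.findall(text),
        "sequences": SEQUENCE_RE.findall(text),
        "organisms": ORGANISM_RE.findall(text),
    }


if __name__ == "__main__":
    record = "BRCA1 variant found in Homo sapiens sample: ATGGATTTATCTGCTCTTCG."
    print(extract_entities(record))
```

These heuristics are deliberately naive. The organism pattern, for example, also matches any capitalized word followed by a lowercase word, such as the first two words of a sentence, which is the kind of ambiguity that part-of-speech information helps resolve.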

The PoC project showed promising results, demonstrating the potential for knowledge graph systems to extract valuable information from unstructured data sources. As the project leader, I was pleased to have contributed to the development of this innovative solution.
