Company: Collegis Education
Position: Senior Google Data Engineer
Project Objective:
Developed and maintained scalable data pipelines and solutions that help colleges and universities achieve strategic outcomes such as enrollment growth, anytime, anywhere learning, and a sustainable future.
Technologies and Tools:
- Programming Language: Python
- Data Transformation: DBT (Data Build Tool)
- Data Integration: Fivetran
- Data Visualization: ThoughtSpot
- Google Cloud Platform (GCP) Services:
  - Cloud Functions
  - Cloud Composer
  - Cloud Scheduler
  - Speech-to-Text API
  - Cloud Natural Language API
  - Cloud Run
  - Cloud Storage
  - BigQuery
Key Accomplishments:
- Developed near-real-time, event-based ETL pipelines on Google Cloud Functions to extract data from external APIs such as PhoneBurner/Five9 and the Canvas LMS.
  - Implemented efficient data-fetching mechanisms to keep data fresh and available.
  - Optimized pipeline performance to handle high data volumes with minimal latency.
- Set up continuous integration and continuous deployment (CI/CD) with GitHub and GitHub Actions.
  - Automated the build, test, and deployment processes to ensure code quality and reliability.
  - Streamlined the development workflow and reduced manual intervention.
- Created ETL pipelines using Fivetran to integrate data from multiple sources, including SQL Server, Google Sheets, and Salesforce.
  - Configured data connectors and mappings to ensure seamless data integration.
  - Implemented automatic scaling and schema evolution handling to accommodate data growth and changes.
- Used DBT to transform and document data in BigQuery.
  - Developed DBT models, seeds, and exposures to define data transformations and business logic.
  - Created comprehensive documentation to facilitate data understanding and usage.
- Generated data lineage with DBT to show data flow and dependencies clearly.
  - Enabled data traceability and impact analysis for effective data governance.
  - Supported data auditing and compliance requirements.
- Built interactive dashboards using ThoughtSpot to provide insightful reports and data visualizations.
  - Collaborated with stakeholders to identify key metrics and reporting requirements.
  - Designed intuitive and visually appealing dashboards to support data-driven decision-making.
- Extracted call audio from PhoneBurner/Five9, transcribed it with the Cloud Speech-to-Text API, and ran sentiment analysis with the Cloud Natural Language API.
  - Enabled advanced analytics and insights on customer interactions and sentiment.
  - Loaded the processed data into BigQuery for further analysis and reporting.
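
Illustrative sketch of the call transcription and sentiment flow described in the last accomplishment. This is a minimal outline under stated assumptions, not the project's actual code: it assumes recordings have already been copied to Cloud Storage, and the names `CallSentimentRow`, `process_call`, and the table ID `my-project.calls.call_sentiment` are hypothetical. The Google client calls follow the public Python client libraries; audio encoding settings may need to be added depending on the recording format.

```python
from dataclasses import asdict, dataclass
from typing import Callable, Tuple


@dataclass
class CallSentimentRow:
    """Hypothetical row layout for the BigQuery destination table."""
    call_id: str
    transcript: str
    sentiment_score: float      # -1.0 (negative) to 1.0 (positive)
    sentiment_magnitude: float  # overall emotional strength, >= 0


def transcribe_gcs_audio(gcs_uri: str, language_code: str = "en-US") -> str:
    """Transcribe a recording stored in Cloud Storage with Speech-to-Text."""
    # Imported lazily so the module loads even without the GCP libraries.
    from google.cloud import speech

    client = speech.SpeechClient()
    audio = speech.RecognitionAudio(uri=gcs_uri)
    # Assumption: the file format lets the API infer encoding and sample rate.
    config = speech.RecognitionConfig(language_code=language_code)
    # Long-running recognition is the variant meant for longer recordings.
    operation = client.long_running_recognize(config=config, audio=audio)
    response = operation.result(timeout=600)
    return " ".join(r.alternatives[0].transcript.strip() for r in response.results)


def analyze_sentiment(text: str) -> Tuple[float, float]:
    """Return (score, magnitude) from the Cloud Natural Language API."""
    from google.cloud import language_v1

    client = language_v1.LanguageServiceClient()
    document = language_v1.Document(
        content=text, type_=language_v1.Document.Type.PLAIN_TEXT
    )
    sentiment = client.analyze_sentiment(
        request={"document": document}
    ).document_sentiment
    return sentiment.score, sentiment.magnitude


def load_row(row: dict, table_id: str = "my-project.calls.call_sentiment") -> None:
    """Stream one row into BigQuery (the table ID is a placeholder)."""
    from google.cloud import bigquery

    errors = bigquery.Client().insert_rows_json(table_id, [row])
    if errors:
        raise RuntimeError(f"BigQuery insert failed: {errors}")


def process_call(
    call_id: str,
    gcs_uri: str,
    transcribe: Callable[[str], str] = transcribe_gcs_audio,
    analyze: Callable[[str], Tuple[float, float]] = analyze_sentiment,
    load: Callable[[dict], None] = load_row,
) -> dict:
    """Transcribe one call, score its sentiment, and load the result."""
    transcript = transcribe(gcs_uri)
    score, magnitude = analyze(transcript)
    row = asdict(CallSentimentRow(call_id, transcript, score, magnitude))
    load(row)
    return row
```

The three steps are passed in as callables so the flow can be exercised locally with stand-ins, for example `process_call("c1", "gs://bucket/c1.wav", transcribe=lambda uri: "hello", analyze=lambda t: (0.5, 0.5), load=print)`, without calling any Google Cloud service.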