Data Engineer
Our internal team is looking for a non-traditional Data Engineer (DE), one who blends data engineering with data science (DS). This person will develop and optimize data architectures that support business intelligence, predictive analytics, and AI/ML applications. The role involves designing and structuring data pipelines that connect SAP, Databricks, and Microsoft CRM, enabling trend analysis, forecasting models, and AI-driven insights. The focus extends beyond ETL pipelines to building a feature store rather than traditional data models.
This role requires a deep understanding of how data should be organized to support efficient reporting, historical trend analysis, and advanced analytics.
Given our global operations, fluency in both English and Chinese is highly beneficial.
Duration: 9+ months contract
Pay Range: $60/hr to $85/hr (based on experience level)
Responsibilities:
- Develop and implement scalable data models that link SAP, Databricks, and Microsoft CRM, ensuring they support business intelligence, forecasting, and AI-driven analytics.
- Build robust data pipelines to extract, process, and structure information from SAP HANA, SAP BW, OData, and Microsoft CRM, ensuring accuracy and usability for analytical tools.
- Design data schemas that enhance historical trend analysis, predictive modeling, and performance monitoring, rather than simply storing raw transactional data.
- Construct data warehouses and structured datasets that allow for efficient querying and insightful analysis, reducing the need for complex transformations downstream.
- Ensure data processing frameworks can accommodate both real-time updates and scheduled batch processing, supporting diverse analytical needs.
- Work closely with stakeholders across business functions to align data structures with operational goals, ensuring usability and relevance.
- Automate data ingestion and transformation using Python, SQL, and cloud-based technologies, streamlining data workflows.
- Implement data governance policies, ensuring compliance with security protocols, access management, and audit logging.
- Maintain and troubleshoot data pipelines, minimizing downtime and ensuring smooth data availability for reporting and analytics teams.
- Develop metadata and lineage tracking strategies, improving data transparency and usability across the organization.
Candidate Profile:
1. Strong Data Engineering (DE) Background
- Experience in SQL, Python, ETL, and data modeling
- Hands-on with Databricks including Delta Lake storage, efficient partitioning strategies, and query performance tuning
- Familiar with Apache Airflow for orchestration
- Experience with dbt (nice to have)
2. Experience in Feature Engineering & Data Science (DS)
- Some exposure to data modeling and ML feature engineering
- Experience with feature store development to support predictive modeling is a plus
- Proficiency in Python for data manipulation & ML
- Familiarity with ML libraries such as scikit-learn
- Experience with ML platforms
- Prior exposure to predictive modeling/propensity modeling is a plus
3. Business Acumen & Predictive Feature Design
- Strong understanding of SAP and CRM data
- Ability to identify features with predictive power
- Experience in creating business-relevant features, e.g., RFM (Recency, Frequency, Monetary) modeling
- Familiarity with Microsoft CRM (preferably Dynamics) and its underlying data structures for business analysis
- 3-6 years of experience in data engineering, with a strong background in data modeling and integration
- Proficiency in Python and SQL for data transformation, pipeline automation, and performance tuning
- Deep knowledge of Databricks
- Expertise in SAP data structures, including SAP HANA, SAP BW, OData, BAPI, and IDocs
- Hands-on experience designing optimized data architectures that support trend analysis, forecasting models, and AI-powered insights.
- Experience structuring data to facilitate business intelligence reporting, advanced forecasting, and predictive modeling.
- Ability to manage large-scale data pipelines, optimizing them for performance and scalability.
- Strong understanding of workflow orchestration tools (e.g., Apache Airflow, Prefect, or Azure Data Factory) to automate and schedule data tasks
- Prior experience implementing security controls, access management, and governance frameworks to maintain data integrity
- Bilingual proficiency in Chinese and English is highly preferred
- Business-oriented mindset, with the ability to align data structures with operational and strategic goals.
- Experience in forecasting, predictive analytics, or AI/ML model deployment, and familiarity with automated ML pipelines
- Familiarity with CI/CD pipelines for managing and deploying data infrastructure
- MLOps, DevOps, and cloud services skills
- Exposure to cloud-based data lake architectures, optimizing storage for cost and performance
- Knowledge of metadata-driven data engineering, improving discoverability and tracking across datasets
- Advanced education (MBA, MA, or Ph.D.) is a plus, as we welcome professionals with strong analytical backgrounds
**No C2C profiles are accepted**
Thank you!
FocusKPI Hiring Team
Founded in 2010, FocusKPI, Inc. (FocusKPI) is a data science and technology firm specializing in predictive analytics practice and methodologies. FocusKPI is a US company headquartered in Silicon Valley, California, with an East Coast office in Boston, Massachusetts.
NOTICE: Please be aware of fraudulent emails regarding job postings, job offers, and fake checks. FocusKPI's recruiting team will reach out only via the @focuskpi.com email domain. If you have received fraudulent emails now or in the past, please report them to https://reportfraud.ftc.gov/ .
The domain @focuskpijobs.com is fraudulent and not related to FocusKPI. Please do not reply to or communicate with anyone using @focuskpijobs.com.