AI/ML Engineer
Greetings from Javen Technologies Inc.,
Job Title: AI/ML Engineer AWS Bedrock RAG Model
Location: 100% Remote
Duration: 6+ Months with extensions
Required Skills (Must Have)
Technical (Core):
- Hands-on AWS Bedrock experience invoking embedding & text models; experience with model customization / fine-tuning workflows.
- Terraform module authoring (composition, variable design, drift detection, environment promotion).
- Production RAG system experience (document processing, chunking strategies, embedding generation, retrieval optimization).
- Python engineering excellence (testable, modular code; familiarity with packaging, logging, dependency management).
- SageMaker training jobs (PyTorch estimator or equivalent, VPC config, KMS-encrypted volumes & outputs).
- Vector search proficiency with OpenSearch (kNN / ANN, index design, embedding normalization).
- Step Functions (standard + distributed Map), Lambda, SQS retry patterns, SNS notifications.
- Data curation & labeling: structuring JSONL training/eval sets, metadata hygiene, dataset versioning practices.
- Retrieval & answer quality evaluation: recall@K, MRR (or similar), error categorization (hallucination vs. retrieval failure).
- Secure AWS networking: VPC subnet/AZ selection (including GPU AZ constraints), security groups, private endpoints.
- IAM & KMS usage for ML pipelines (role scoping, encryption at rest/in transit considerations).
- Observability: designing metrics/logging for pipeline latency, throughput, failure classification; CloudWatch dashboards/alarms.
- Practical prompt engineering & prompt lifecycle management (versioning, regression testing).
- Understanding of fine-tuning paradigms: full vs. parameter-efficient (LoRA), overfitting mitigation, hyperparameter trade-offs.
Nice to Have (Differentiators)
AWS & Platform:
- Bedrock Knowledge Bases, Agents, Guardrails: early adoption / integration patterns.
- Amazon Q or similar copilots in internal tooling contexts.
- Hybrid retrieval (BM25 + vector fusion, rerankers) or experimentation with multi-vector approaches.
Search & ML Optimization:
- Advanced embedding strategies (domain adaptation, periodic regeneration policies).
- Index lifecycle management (ILM), hot/warm tiering, shard sizing heuristics.
- Experiment frameworks for retrieval (A/B harness, statistical significance testing).
Security & Compliance:
- Exposure to FFIEC or similar financial regulatory expectations (change control, logging, segregation of duties).
- Vault / secret management integration patterns (token renewal, secret rotation).
Data & Evaluation:
- Automated data quality pipelines (schema / semantic validation, anomaly detection).
- Prompt & answer regression harness (baseline answer store, delta classification).
Joshua Gidugu