M+E Technology Job Board

Principal Data Scientist, AI Labs

  • Full Time
  • Remote

Veritone

WHAT YOU’LL DO

Use advanced data science techniques to develop and deploy models built on LLMs, GPT and RAG for applications that delight our customers and add new or enhanced functionality to our platform.
Propose and deliver AI projects into production from end to end to drive business growth.
Work with the best models available either commercially or in open source.
Evaluate models to identify what works best in terms of accuracy, performance and cost.
Develop data-driven narratives that communicate findings and recommendations effectively to stakeholders.

WHAT YOU’LL NEED

Expert knowledge of LLMs, Retrieval-Augmented Generation (RAG), and multimodal foundation models.
Deep learning proficiency, especially with sequence and generative models (Attention, Transformers, GANs, Diffusion, etc.).
Excellent written and verbal communication.
Experience with Python, PyTorch, LangChain/LlamaIndex, Jupyter Notebooks, Git, Docker, and vector databases.
Experience with designing and implementing APIs that leverage AI models.
Experience implementing state-of-the-art RAG pipelines for question answering systems, chatbots or copilots.
Experience processing unstructured data for LLM applications.
Experience with embedding models for multiple modalities (e.g., text, image, video).
Experience running LLMs locally (e.g., Ollama, llama.cpp, Llamafile).
Awareness of AI-related risks and potential countermeasures.

BONUS POINTS IF

PhD from a top university in one of the following: Computer Science, Electrical Engineering, Signal Processing, Physics, Robotics, Machine Learning, Applied Statistics, or Applied Mathematics.
Participation and strong placement in Kaggle competitions.
Experience developing software in C++, Java, Go or similar.
Experience in deep learning in Computer Vision, Speech/Audio processing or similar.
Experience building data preprocessors and performing exploratory data analysis.
Experience deploying models using AWS, Hugging Face, and Replicate.
Experience creating UIs using Streamlit or Gradio.