Alexander Karpekov
PhD Student in Computer Science. Data Scientist.
Political Science and Economics student turned Big Tech guy turned Computer Science grad student. After spending almost ten years as a Data Scientist in industry (seven years at Google and three years at a startup), I decided to go back to school to pursue a PhD in Computer Science with a focus on AI and Machine Learning.
I have a particular interest in the field of Explainable Artificial Intelligence, and the intersection of AI and social sciences. I am also always looking to explore how to use data visualizations for storytelling.
I am always open to new opportunities to collaborate on projects at the intersection of AI and other fields, so feel free to reach out!
Education
Georgia Institute of Technology
Atlanta, GA
PhD in Computer Science
Current research focuses on Interactive Computing, Explainable AI, and Ubiquitous Computing.
Advisors: Sonia Chernova and Thomas Plötz.
Georgia Institute of Technology
Atlanta, GA
MSc in Computer Science | GPA: 3.9/4.0
Completed a second Master's degree remotely while working full-time at Google. Focused on Machine Learning and Artificial Intelligence.
University of California, San Diego
San Diego, CA
MA in Economics | GPA: 3.8/4.0
Worked as a Teaching Assistant for 3 graduate-level classes in Statistics and Econometrics, leading sessions for 120+ students. Received the best TA award. Regional focus was on China. Studied Mandarin Chinese.
MGIMO University
Moscow, Russia
BA in Political Science | GPA: 92/100
Studied Comparative Politics. Languages: English, French. Thesis on History of Migration in the United Kingdom.
Industry Experience
Senior Data Scientist (L5)
Worked as a Data Scientist in Google Search and YouTube Music, focusing on statistical data analysis and on A/B experiment design and evaluation to improve search quality and music recommendations. Was promoted twice, reaching L5. Presented my work and findings at regular director-, VP-, and executive-level meetings, including to YouTube CEO Susan Wojcicki. A few notable projects:
- Developed a pathfinding algorithm in song embedding space that improved music recommendations and led to a 3% boost in user engagement and music discovery rates. This work was presented at a Google-wide Data Science Conference in 2023.
- Implemented a new methodology to cluster YouTube's multi-billion music corpus using text, sound, search, and co-watch embeddings, which led to a 30% reduction in harmful watchtime and a 0.5% increase in music revenue (hundreds of millions of dollars).
- Created a new counterfactual causal impact methodology to evaluate the effect of a new feature launch on user engagement and conversion. It helped establish that the launch had no statistically significant long-term effects on key business metrics. The analysis was instrumental in the decision, made at the Engineering and Product VP level, to halt the global rollout.
Dataminr
London, UK & New York, NY
Data Analyst
Worked as a Data Analyst in the Data Science team, focusing on Twitter data analysis and news discovery algorithms.
- Built statistical models to automatically classify Twitter user handles.
- Conducted Twitter user clustering and unsupervised learning using network analysis methods to improve news discovery algorithms.
- Led a company-wide effort to automate reporting, replacing Excel with Python.
Publications
DISCOVER: An Unsupervised Approach to Cluster and Label Human Activities in Smart Homes
Currently Under Review
Alexander Karpekov, Sonia Chernova, Thomas Plötz
In this paper, we introduce DISCOVER, an active-learning method to identify fine-grained human activities from unlabeled smart home sensor data. DISCOVER combines self-supervised feature extraction and embedding clustering with a custom-built visualization tool, which allows researchers to identify, label, and track human activities and changes over time.
Transformer Explainer: Interactive Tool to Learn about LLMs
IEEE VIS, AAAI
Aeree Cho, Grace Kim, Alexander Karpekov, et al.
An interactive visualization tool that helps users understand how transformer models work through hands-on experimentation and real-time feedback.
- Best Poster Award at IEEE VIS 2024
- Went viral: 150K+ visitors in the first 3 months
Is Attention Truly All We Need?
Deep Learning for Text: Final Project
Alexander Karpekov, Sidney Miller
This project investigates using Transformer attention weights to derive feature importance in NLP tasks. It demonstrates that combining attention weights with gradient information improves explainability, and it provides an open-source tool on GitHub for applying this method to any Transformer model.
Double-Relocation Policy Evaluation in Guangdong, China using Night Lights Data
ArcGIS: Final Project
Alexander Karpekov
This project examines Guangdong's shifting economic growth using satellite Night Lights data, focusing on development beyond the Pearl River Delta and the impact of 2008 government policies.
Skills
Programming
Python
SQL
TypeScript
R
Stata
C
Java
ML & DS
PyTorch
Hugging Face
TensorFlow
Keras
Scikit-learn
Statsmodels
XGBoost
Data & Viz
NumPy
Pandas
SciPy
Jupyter
Colab
Matplotlib
Altair
Plotnine
Frontend
Svelte
D3
HTML
Tailwind CSS
Figma
Illustrator
Languages
English
Russian
French
German
Mandarin
Latin
Ancient Greek