My Journey: Participating in the Google Long Context Window Competition on Kaggle

I participated in the Google Long Context Window competition on Kaggle. This was my first-ever direct participation in a Kaggle competition—or any sort of competition, for that matter. Previously, I was indirectly part of a competition where I worked on Unity development for a Premetal Hackathon, and our team won. But this time was different.

Why I Decided to Participate

On September 4, 2024, I quit my job. After that, I focused on upskilling and investing time in my own portfolio projects. This competition became part of that portfolio work. Interestingly, I joined just a few days before the deadline, even though I had known about it for a month and a half.

Why so late? I thought I needed more time to develop a solid idea around the "long context window" concept. I was also worried that someone might steal my idea :) But it turns out the world is full of people more intelligent and talented than you.

Eventually, I realized that participating, even with limited preparation, was better than missing out.

Benefits of Participating in Kaggle Competitions

I believe participating in competitions like these is invaluable. You can list them on your resume to stand out and demonstrate your skills. Competitions are also incredible learning and sharing opportunities, and if your idea works, there’s often significant prize money involved, and sometimes badges or certifications are awarded as well.

For instance, the Gemini Long Context Window Challenge had a $100,000 prize pool divided among four winners. Beyond the prize, you get to learn from how people across different regions approach the same problem. The exposure to diverse solutions is priceless.

My Idea and Approach

I learned a lot during this competition, especially about using Kaggle Notebooks, sharing datasets, and working with Google Generative AI libraries in Python.

My project aimed to create a personal communication coach using your own data. This data could include video recordings (like the ones I have on YouTube), blogs, notes, social media posts—basically, anything you’ve written or recorded.

To enrich my project, I also included four old research papers I had written during my MS studies. (Why did I add them? That story comes a few paragraphs below. The story of how I didn’t complete my MS is a tale for another day!)

Personal communication coach with Gemini Long Context Window.

Frankly speaking, at first, I thought that using the transcripts from my YouTube videos would be enough to participate. However, when I started working with context caching, I realized that around 34k tokens were required. For reference, one token is roughly four characters, and despite having over 130 videos, my transcripts weren’t enough to reach the required token count. Many of my videos were either silent or had very little spoken content.
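The token arithmetic above can be sketched in a few lines of Python. This is only a rough estimate based on the four-characters-per-token rule of thumb; real tokenizers give different counts, and the function names and sample transcripts here are hypothetical, not taken from the notebook.

```python
import math

# Rule of thumb from the post: one token is about four characters.
# Real tokenizers vary, so treat the result as an estimate only.
CHARS_PER_TOKEN = 4


def estimate_tokens(text: str) -> int:
    """Rough token estimate: character count divided by four, rounded up."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)


def tokens_short_of(texts: list[str], required_tokens: int) -> int:
    """Estimated tokens still missing to reach the requirement (0 if already met)."""
    total = sum(estimate_tokens(t) for t in texts)
    return max(0, required_tokens - total)


# Hypothetical transcripts standing in for real video captions.
transcripts = [
    "Hello and welcome back to my channel. " * 20,
    "Tell me about yourself. " * 10,
]
print("Estimated tokens still needed:", tokens_short_of(transcripts, 34_000))
```

Running a check like this before recording more material shows early how far a set of transcripts is from the target.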

To address this, I created more than 20 additional videos to increase the token count. Some of these videos were not listed publicly; they were old and new recordings where I talked about myself or practiced answering “Tell me about yourself.” One video, for instance, was a vlog with English commentary (I’m terrible at this!) where I showcased the Havelian landscape. I recorded it in English because, before the competition deadline, I happened to be in Havelian and found some time alone to film. I knew I was running out of transcript words, so I decided to try vlogging in English. This overall effort allowed me to meet the minimum token requirement for the context window.

But then came another twist. When I checked the winning criteria, I realized the eligibility requirements demanded processing 100,000 tokens, as this was essential for the long-context challenge and the use of cache context. At this point, creating more videos was impossible given the tight deadline (just one day remaining). I had already spent significant time recording videos to reach 10k tokens, so producing enough content for 100k tokens in one day wasn’t feasible.

To meet the requirement, I got creative. I incorporated content from my blogs, some of my cover letters, resume summaries, and—finally—my old research papers. Collectively, these contributions pushed my project past the 100,000-token mark. But my journey didn’t end there!

After increasing the input size, I discovered that the Gemini model (gemini-1.5-flash-001) was unable to recall the cached context consistently. It worked once or twice but failed most of the time. This surprised me. I searched for solutions to the issue, but nothing seemed to work. I tried different prompts with clear prefixes, such as “refer to my input file that I have given you earlier” or “based on the input file in the cache context,” but they didn’t work either.

Finally, I decided to switch to a different model—Gemini Model (gemini-1.5-flash-002)—and it worked! I’m still unsure what caused the problem with the earlier model.
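One way to investigate a problem like this is to send the same question to both model versions against the same cached material and compare the answers side by side. The sketch below assumes the `google-generativeai` Python package and its context caching interface (`caching.CachedContent.create` and `GenerativeModel.from_cached_content`). The helper names, the `GOOGLE_API_KEY` environment variable, and the system instruction are hypothetical, and this is not the code from the actual notebook.

```python
import datetime
import os
from typing import Callable, Dict, List


def compare_models(
    question: str,
    model_names: List[str],
    ask: Callable[[str, str], str],
) -> Dict[str, str]:
    """Send the same question to each model and collect the answers by model name."""
    results: Dict[str, str] = {}
    for name in model_names:
        try:
            results[name] = ask(name, question)
        except Exception as exc:  # record the failure instead of stopping the comparison
            results[name] = f"ERROR: {exc}"
    return results


def make_cached_ask(corpus: str, ttl_minutes: int = 10) -> Callable[[str, str], str]:
    """Build an ask(model_name, question) function backed by Gemini context caching.

    Assumption: the API key is stored in the GOOGLE_API_KEY environment variable.
    """
    # Imported lazily so compare_models can be used without the SDK installed.
    import google.generativeai as genai
    from google.generativeai import caching

    genai.configure(api_key=os.environ["GOOGLE_API_KEY"])

    def ask(model_name: str, question: str) -> str:
        # Create a short-lived cache holding the whole corpus for this model.
        cache = caching.CachedContent.create(
            model=model_name,
            system_instruction="Answer using only the cached material.",
            contents=[corpus],
            ttl=datetime.timedelta(minutes=ttl_minutes),
        )
        try:
            model = genai.GenerativeModel.from_cached_content(cached_content=cache)
            return model.generate_content(question).text
        finally:
            cache.delete()  # remove the cache once the question is answered

    return ask


# Example usage (needs an API key and a corpus above the caching minimum):
# ask = make_cached_ask(corpus_text)
# answers = compare_models(
#     "What recurring issues appear in my speaking style?",
#     ["models/gemini-1.5-flash-001", "models/gemini-1.5-flash-002"],
#     ask,
# )
```

Keeping the model name as a parameter means switching versions is a one-argument change, and the comparison loop can be exercised with a stand-in `ask` function before spending any API quota.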

Eventually, I was able to submit my project successfully, but there was a big mishap I discovered one day after submission.

The key takeaway here: always carefully check the competition requirements!

The Submission Mishap

After submitting my work, I initially thought everything was fine. But later, I discovered that version 9 of my notebook had been submitted instead of version 11, which was my latest and most refined version. This was shocking and disappointing because my best work didn’t make it into the competition.

The Potential of Long Context Window Projects

Using your data, you can gain insights into recurring conversational issues, analyze your personality, and improve self-awareness. For example, my notebook explored how a personal communication coach could help individuals enhance their communication skills.

You can check out my notebook and video presentation (created just before the deadline) for more details. The competition provided a lot of learning material and showcased excellent work by participants.

As a product manager, I see immense potential in these long-context window projects. Many ideas presented here could evolve into market-ready products.

Exploring Other Use Cases

My use case was just the beginning. Here are some other ideas:

Bias Analysis: Using long-context data to uncover biases, for example in media coverage of the Palestine crisis. Analyzing videos from Piers Morgan’s YouTube channel could provide valuable insights.
Recruitment: Companies could use such data to identify the right candidates for jobs.
Language Proficiency: Language institutes could rate proficiency based on long-context analysis.
Self-Discovery: Individuals could uncover their strengths, values, and goals using their data.

The possibilities are endless.