From Code to Concept: Discovering New Ideas with GitHub API and AI
Sometimes, when it’s raining outside, your favorite show has ended, and even games don’t feel inspiring, you start looking for something truly exciting to do, but you’re not sure what. Wasting that time can feel frustrating, especially when you know you have the motivation to create or explore. One great way to break out of that rut is to explore GitHub: a goldmine of new ideas, interesting projects, and clever implementations. In this short post, I’ll share how I use the GitHub API to search for inspiration and summarize what I find using AI.
Tools I Use
- GitHub API for fetching repositories.
- meta-llama (or other LLMs) to summarize README files.
- Kaggle Notebook for code execution and result presentation.
To follow along, you’ll need a GitHub access token. You can generate one in your GitHub settings under Developer settings → Personal access tokens.
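The snippets below assume two variables: github_token (the GitHub token) and ai_token (an OpenRouter API key, in my case). In a Kaggle notebook, one way to load them without hardcoding is the built-in secrets store; the secret names here are labels you define yourself when attaching the secrets:

# Load API tokens from Kaggle's secrets store
# (GITHUB_TOKEN and AI_TOKEN are my own secret names, not fixed values)
from kaggle_secrets import UserSecretsClient

user_secrets = UserSecretsClient()
github_token = user_secrets.get_secret("GITHUB_TOKEN")
ai_token = user_secrets.get_secret("AI_TOKEN")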
Step 1: Search GitHub Repositories
We start by crafting a query to search for repositories related to a specific topic (in this case, "web scraping") that were updated in the last 14 days and are written in Python:
import requests
import urllib.parse
from datetime import datetime, timedelta

BASE_URL = "https://api.github.com"

search_text = "web scraping"
# Repos pushed in the last 14 days, written in Python
date_from = (datetime.now() - timedelta(days=14)).strftime("%Y-%m-%d")
SEARCH_QUERY = f'"{search_text}" in:readme,description,topics pushed:>={date_from} language:Python'
HEADERS = {"Authorization": f"token {github_token}"}

def get_list_of_repos():
    # Request the 50 most-starred matching repositories
    params = {
        "q": SEARCH_QUERY,
        "sort": "stars",
        "order": "desc",
        "per_page": 50
    }
    search_url = f"{BASE_URL}/search/repositories?{urllib.parse.urlencode(params)}"
    response = requests.get(search_url, headers=HEADERS)
    return response.json()

json_data = get_list_of_repos()
json_data = get_list_of_repos()
From the API response, I extract just a few fields:
- html_url
- description
- avatar_url
- name
- size
- watchers_count
These are enough to get a quick sense of what a repo is about.
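For a quick look, you can print those fields for the first search result (assuming the query returned at least one item):

# Peek at the fields of the first result
first = json_data["items"][0]
print(first["html_url"], first["name"], first["size"], first["watchers_count"])
print(first["owner"]["avatar_url"])
print(first["description"])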
Step 2: Get the README and Summarize It
Next, we fetch the README for each repo and summarize it using an AI model:
import base64

def get_readme_data(repo_url):
    # The /readme endpoint returns the file's metadata plus its
    # base64-encoded content
    r = requests.get(repo_url, headers=HEADERS)
    if r.status_code == 200:
        content = r.json().get("content", "")
        return base64.b64decode(content).decode("utf-8", errors="ignore")
    return None
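As a quick smoke test, you can point it at any public repository; octocat/Hello-World below is just an arbitrary, well-known example:

# Smoke test against a well-known public repo
sample = get_readme_data(f"{BASE_URL}/repos/octocat/Hello-World/readme")
print(sample[:200] if sample else "No README found")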
Now, let’s use the LLM to generate a 3–4 sentence summary:
MODEL = "meta-llama/llama-4-maverick:free"

def summarize_text_with_model(message):
    # OpenAI-style chat-completion payload, sent to OpenRouter
    payload = {
        "model": MODEL,
        "messages": [{
            "role": "user",
            "content": [{"type": "text", "text": message}]
        }]
    }
    headers = {
        "Authorization": f"Bearer {ai_token}",
        "Content-Type": "application/json"
    }
    response = requests.post("https://openrouter.ai/api/v1/chat/completions",
                             headers=headers, json=payload)
    try:
        return response.json()["choices"][0]["message"]["content"]
    except Exception as err:
        print(err)
        return None
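Before running the full loop, it’s worth a single cheap call to confirm the token and model respond (the prompt is a throwaway example):

# One-off sanity check before looping over 50 repos
print(summarize_text_with_model("Summarize in one sentence: GitHub hosts code repositories."))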
Step 3: Collect and Render the Results
Now we loop through the search results, summarize, and build the data list:
res = []
for el in json_data["items"]:
    # Fetch and summarize the README; skip repos where it is missing
    repo_url = f"{BASE_URL}/repos/{el['full_name']}/readme"
    readme_text = get_readme_data(repo_url)
    if readme_text:
        prompt = f"Summarize the following GitHub README in 3-4 sentences:\n\n{readme_text}"
        summarize_text = summarize_text_with_model(prompt)
        res.append({
            "repo_url": el["html_url"],
            "repo_description": el["description"],
            "summarize_text": summarize_text,
            "owner_avatar_url": el["owner"]["avatar_url"],
            "name": el["name"],
            "size": el["size"],
            "watchers_count": el["watchers_count"]
        })
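One caveat: the loop makes a GitHub API call plus an LLM call per repository, and GitHub rate-limits its endpoints. If requests start failing partway through, you can check how much quota your token has left via GitHub’s /rate_limit endpoint:

# Check the remaining GitHub API quota for this token
limits = requests.get(f"{BASE_URL}/rate_limit", headers=HEADERS).json()
print(limits["resources"]["core"]["remaining"], "core calls remaining")
print(limits["resources"]["search"]["remaining"], "search calls remaining")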
Step 4: Display the Results Nicely
To render everything inline in the notebook, we build an HTML table and display it with IPython:
from IPython.display import HTML, display

html = """
<table style="
border-collapse: collapse;
width: 100%;
font-family: Arial, sans-serif;
font-size: 14px;
background-color: #e3cfa1;
color: #807c0e;
">
<thead style="background-color: #f2f2f2;">
<tr>
<th style="padding: 10px; border: 1px solid #ddd;">Avatar</th>
<th style="padding: 10px; border: 1px solid #ddd;">Repository</th>
<th style="padding: 10px; border: 1px solid #ddd;">Size (KB)</th>
<th style="padding: 10px; border: 1px solid #ddd;">Watchers</th>
<th style="padding: 10px; border: 1px solid #ddd;">Description</th>
<th style="padding: 10px; border: 1px solid #ddd;">Summary</th>
</tr>
</thead>
<tbody>
"""
for item in res:
    html += f"""
    <tr>
        <td style="padding: 10px; border: 1px solid #ddd; text-align: center;">
            <img src="{item['owner_avatar_url']}" width="50" height="50" style="border-radius: 50%;">
        </td>
        <td style="padding: 10px; border: 1px solid #ddd;">
            <a href="{item['repo_url']}" target="_blank"><b>{item['name']}</b></a>
        </td>
        <td style="padding: 10px; border: 1px solid #ddd; text-align: center;">{item['size']}</td>
        <td style="padding: 10px; border: 1px solid #ddd; text-align: center;">{item['watchers_count']}</td>
        <td style="padding: 10px; border: 1px solid #ddd; max-width: 300px; word-wrap: break-word;">
            {item['repo_description']}
        </td>
        <td style="padding: 10px; border: 1px solid #ddd; max-width: 400px; word-wrap: break-word;">
            {item['summarize_text']}
        </td>
    </tr>
    """
html += """
</tbody>
</table>
"""
display(HTML(html))
And the result looks like this: a styled table showing each repository’s avatar, name, size, watcher count, description, and AI-generated summary.
Step 5: Save the Results
Finally, we save the results in HTML, JSON, and CSV formats, with filenames that include today’s date:
from datetime import datetime
import pandas as pd
import json

# Save HTML
with open(f"{datetime.now():%Y-%m-%d}_github_results.html", "w", encoding="utf-8") as f:
    f.write(html)

# Save JSON
with open(f"{datetime.now():%Y-%m-%d}_github_results.json", "w", encoding="utf-8") as f:
    json.dump(res, f, ensure_ascii=False, indent=4)

# Save CSV
df = pd.DataFrame(res)
df.to_csv(f"{datetime.now():%Y-%m-%d}_github_results.csv", index=False, encoding="utf-8")
Final Thoughts
This approach combines the power of the GitHub API with the flexibility of LLMs to quickly surface new ideas, tools, and inspirations — especially when you’re in need of a creative push. Whether you’re hunting for open-source tools or just exploring what the dev community is building, this workflow helps turn code into concepts at scale.