Discussion about this post

Dorkwad

Also Steve, the AIs all know who you are. The companies have gotten better at reducing bias in their chatbots, so you can ask one directly how it can help you, Steve Sailer. For example:

----

Me:

Steve Sailer wants to use an AI for his work. How can AI help him? He needs accuracy. Hallucinations are fatal. So the AI is probably best to write programs for him, right? What programs? I'm thinking small scripts that you can mostly write one shot. He doesn't have time to learn programming in depth.

ChatGPT:

Yeah, small one-shot scripts are the right level. If hallucinations are fatal, then the AI shouldn't be used to generate facts or analysis directly—just tools. So the goal is: use AI to offload mechanical tasks, not judgment.

Here’s where AI-written programs could help someone like Sailer:

---

1. Text Analysis / Search Tools

These can be built once and reused indefinitely.

Keyword highlighters: Highlight specified terms in a document or website.

Concordance tools: Find all instances of a phrase across a corpus and show context.

Frequency counters: Count and rank word/phrase usage in documents.

Diff scripts: Compare versions of an article or webpage and highlight what's changed.
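
A minimal sketch of what a frequency counter and a concordance tool could look like in Python, using only the standard library. The function names and the sample sentence are made up for illustration; a real version would read text from files instead.

```python
import re
from collections import Counter


def word_frequencies(text, top_n=5):
    """Return the top_n most common lowercase words as (word, count) pairs."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(words).most_common(top_n)


def concordance(text, phrase, width=30):
    """Return every occurrence of phrase with up to `width` characters of context per side."""
    hits = []
    for match in re.finditer(re.escape(phrase), text, flags=re.IGNORECASE):
        start = max(match.start() - width, 0)
        end = min(match.end() + width, len(text))
        # Flatten newlines so each hit prints on one line.
        hits.append(text[start:end].replace("\n", " "))
    return hits


if __name__ == "__main__":
    # Hypothetical sample text, standing in for an article or corpus.
    sample = "The census counts people. The census is taken every ten years."
    print(word_frequencies(sample, top_n=3))
    for hit in concordance(sample, "census", width=12):
        print("...", hit, "...")
```

Matching is case-insensitive and purely literal, so there is nothing for the script to hallucinate: it only reports what is in the text.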

---

2. Web Automation

Useful for gathering structured data or monitoring sources.

Web scraping:

Pull census data, public health tables, or academic metadata.

Monitor specific websites for updates or changes.

RSS filtering:

Create a filter that forwards only items matching specified keywords.

Archiving:

Auto-save specified URLs to the Wayback Machine or local disk.
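
As a rough illustration of the RSS filtering idea, here is a sketch that parses a feed and keeps only items mentioning given keywords. It assumes a plain RSS 2.0 feed with `item`, `title`, `description`, and `link` elements; the sample feed and the function name are invented. Fetching the feed over the network and forwarding matches by email are left out.

```python
import xml.etree.ElementTree as ET


def filter_rss_items(rss_xml, keywords):
    """Return (title, link) pairs for items whose title or description contains a keyword."""
    wanted = [k.lower() for k in keywords]
    root = ET.fromstring(rss_xml)
    matches = []
    for item in root.iter("item"):
        title = item.findtext("title", default="")
        description = item.findtext("description", default="")
        haystack = f"{title} {description}".lower()
        if any(k in haystack for k in wanted):
            matches.append((title, item.findtext("link", default="")))
    return matches


if __name__ == "__main__":
    # Hypothetical feed content; a real script would download this text first.
    sample_feed = """<rss version="2.0"><channel>
      <item><title>New census tables released</title>
            <link>https://example.com/census</link>
            <description>County-level data.</description></item>
      <item><title>Weekend recipes</title>
            <link>https://example.com/recipes</link>
            <description>Soup.</description></item>
    </channel></rss>"""
    for title, link in filter_rss_items(sample_feed, ["census", "baseball"]):
        print(title, link)
```

The match is a simple substring test, so "census" would also match "consensus"-style false positives only if the keyword appears inside a longer word; word-boundary matching is a possible refinement.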

---

3. Data Cleanup and Reformatting

Make raw tables usable for Excel or R.

Convert PDFs to clean CSVs.

Remove footnotes, normalize weird Excel formats.

Detect and fix malformed data (dates, missing headers, etc.).
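
A sketch of the "detect and fix malformed dates" step, standard library only. The list of accepted date formats, the column names, and the function names are assumptions for the example. Rows that cannot be parsed are set aside for a human to look at, not guessed at.

```python
import csv
import io
from datetime import datetime

# Assumed input formats. Note that 3/5/2024 is read as March 5 (US order).
DATE_FORMATS = ("%Y-%m-%d", "%m/%d/%Y", "%B %d, %Y")


def normalize_date(value):
    """Return value as YYYY-MM-DD, or None if it matches no known format."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(value.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    return None


def clean_rows(csv_text, date_column):
    """Split CSV rows into (cleaned, rejected) based on whether the date parses."""
    cleaned, rejected = [], []
    for row in csv.DictReader(io.StringIO(csv_text)):
        iso = normalize_date(row.get(date_column) or "")
        if iso is None:
            rejected.append(row)
        else:
            row[date_column] = iso
            cleaned.append(row)
    return cleaned, rejected


if __name__ == "__main__":
    # Hypothetical raw table with mixed date styles and one bad value.
    raw = 'date,value\n2024-01-02,5\n3/5/2024,7\n"March 9, 2024",8\nn/a,6\n'
    good, bad = clean_rows(raw, "date")
    print("cleaned:", good)
    print("needs review:", bad)
```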

---

4. Spreadsheet Automation

Scripts that process data in consistent, pre-defined ways.

Generate pivot tables from raw data.

Cross-tabulate race vs. income vs. geography.

Apply specific formulas or conditional formatting automatically.
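
Cross-tabulation is simple enough to sketch without a spreadsheet. The example below counts rows for every combination of two columns; the column names and sample rows are hypothetical, and a real script would more likely load a CSV and might use pandas instead.

```python
from collections import Counter


def cross_tabulate(rows, row_key, col_key):
    """Count rows for each (row_key, col_key) pair and return a nested dict table."""
    counts = Counter((r[row_key], r[col_key]) for r in rows)
    row_labels = sorted({r for r, _ in counts})
    col_labels = sorted({c for _, c in counts})
    # Counter returns 0 for combinations that never occur.
    return {r: {c: counts[(r, c)] for c in col_labels} for r in row_labels}


if __name__ == "__main__":
    # Hypothetical records standing in for rows of a raw data file.
    records = [
        {"region": "West", "income": "high"},
        {"region": "West", "income": "low"},
        {"region": "East", "income": "low"},
        {"region": "West", "income": "low"},
    ]
    for region, cells in cross_tabulate(records, "region", "income").items():
        print(region, cells)
```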

---

5. Visualization

Simple static plots (bar charts, maps) that avoid human charting overhead.

Plot population distributions.

Auto-generate line charts from public datasets.

Highlight outliers or percentiles.
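
Before anything gets highlighted on a chart, the outliers have to be identified. One common convention, used here as an assumption and not the only option, is the 1.5 × IQR rule. The sketch covers only that flagging step with the standard library; the plotting itself (for example with matplotlib) is left out, and the sample numbers are invented.

```python
import statistics


def flag_outliers(values, k=1.5):
    """Return values lying more than k * IQR below Q1 or above Q3."""
    q1, _median, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


if __name__ == "__main__":
    # Hypothetical data series with one obviously extreme value.
    data = [1, 2, 3, 4, 5, 6, 7, 8, 100]
    print("outliers:", flag_outliers(data))
```

`statistics.quantiles` needs at least two data points, and different quartile methods can move the fences slightly, so borderline points may be flagged differently by other tools.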

---

6. Local Search Tools

If he has a big archive (e.g. PDFs, old blog posts, census tables):

Build a local full-text search with keyword ranking.

Tag and organize documents based on metadata.
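
A very small sketch of local full-text search with keyword ranking. Documents are ranked by how many times the query words appear in them, which is a deliberately crude score chosen for the example. The file names and contents are hypothetical, and a real archive would be read from disk (PDFs would need text extraction first).

```python
import re
from collections import Counter

WORD = re.compile(r"[a-z0-9']+")


def build_index(documents):
    """Map each document name to a Counter of its lowercase words."""
    return {name: Counter(WORD.findall(text.lower())) for name, text in documents.items()}


def search(index, query):
    """Return (name, score) pairs sorted by score, highest first; zero scores are dropped."""
    terms = WORD.findall(query.lower())
    scores = {name: sum(counts[t] for t in terms) for name, counts in index.items()}
    hits = [(name, score) for name, score in scores.items() if score > 0]
    # Sort by descending score, then by name so ties come out in a stable order.
    return sorted(hits, key=lambda pair: (-pair[1], pair[0]))


if __name__ == "__main__":
    # Hypothetical archive: name -> text.
    archive = {
        "post_2019.txt": "Census data and more census data.",
        "post_2021.txt": "Baseball data and WAR leaders.",
        "notes.txt": "Nothing relevant here.",
    }
    index = build_index(archive)
    for name, score in search(index, "census data"):
        print(score, name)
```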

---

If he doesn’t want to learn code at all, AI can generate small Python scripts on request—he just runs them. Ideally, he’d use something like:

Python + Jupyter Notebooks for one-off scripts (AI can generate these)

Streamlit or Gradio for simple interfaces (if needed)

He shouldn’t aim for general-purpose tools. Stick to tightly-scoped, single-use tasks: “I want to extract this table from this site and turn it into this format.”

---

Scott Novak

The thinking/reasoning models are generally a bit more reliable, I've found. o3 mini, o3 mini high, and o1 in your ChatGPT account are thinking models that may give better results. Claude Sonnet 3.7, Grok 3 in think mode, and Google Gemini models with "thinking" in the name should do better too. Perplexity has thinking models available as well. Grok is especially great at analyzing discussions going on in X/Twitter, and Gemini models are great at summarizing YouTube videos you don't want to watch (since Google owns YouTube). With ChatGPT, when you are referencing data from 2024 or later, be sure to turn on the "web" icon so it has web search.

As other people have mentioned, ChatGPT, Grok and Perplexity have also released "Deep Thinking" modes that can give deeper summaries of various research topics. The ChatGPT deep thinking is by far the best as of now, though you only get 10 uses a month with your $20 account. Prompt it as you would if you were a researcher asking a grad student to summarize the relevant research, the gaps in the literature that point to needed future research, etc.

Also, if keeping up with the latest AI developments is something you don't mind investing an hour of your life in each week, Zvi Mowshowitz's Don't Worry About the Vase Substack releases a highly valuable weekly report every Thursday on the latest developments (yes, it is moving that fast).

I tried your query in Google Gemini 2.5 Pro (experimental), which is an excellent thinking model released March 25th and available to free users. It is possibly the best model widely available (as of this week!). Just sign up for a Gemini account with your Gmail:

"Okay, here are the top ten MLB players for the 2024 season, ranked by Baseball Reference's Wins Above Replacement (bWAR):

Based on the final 2024 season statistics from Baseball Reference, here are the top ten MLB players ranked by their Wins Above Replacement (bWAR):

Aaron Judge (NYY) - 10.8 bWAR

Bobby Witt Jr. (KCR) - 9.4 bWAR

Shohei Ohtani (LAD) - 9.2 bWAR

Gunnar Henderson (BAL) - 9.1 bWAR

Jarren Duran (BOS) - 8.7 bWAR

Juan Soto (NYY) - 7.9 bWAR

Matt Chapman (SFG) - 7.1 bWAR

Francisco Lindor (NYM) - 6.9 bWAR

José Ramírez (CLE) - 6.8 bWAR

Ketel Marte (ARI) - 6.8 bWAR

Note: José Ramírez and Ketel Marte are tied for the 9th position.

Baseball Reference's WAR calculation aims to quantify a player's total contributions to their team (hitting, baserunning, fielding, and pitching) in one statistic."
