Spotlight → MCP

This morning I asked myself if I could make Spotlight on my Mac talk to Claude. Just a small experiment.

I ended up building a minimal MCP server that exposes Spotlight’s index—files, apps, recent items—as JSON-RPC tools. With that in place, Claude could search my folders, read files, and understand what a project is about.
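
For the curious, here’s roughly what a server like this can look like. It’s a minimal Python sketch assuming the official `mcp` SDK’s FastMCP helper and macOS’s `mdfind` command; the tool names and parameters are illustrative, not necessarily the exact ones I ended up with.

```python
# Minimal sketch of an MCP server that exposes Spotlight as tools.
# Assumes the official `mcp` Python SDK (FastMCP) and macOS's `mdfind` CLI.
import subprocess
from pathlib import Path

from mcp.server.fastmcp import FastMCP

mcp = FastMCP("spotlight")

@mcp.tool()
def spotlight_search(query: str, folder: str | None = None, limit: int = 20) -> list[str]:
    """Query the Spotlight index and return matching file paths."""
    cmd = ["mdfind"]
    if folder:
        cmd += ["-onlyin", folder]  # restrict the search to one directory
    cmd.append(query)
    out = subprocess.run(cmd, capture_output=True, text=True).stdout
    return out.splitlines()[:limit]

@mcp.tool()
def read_file(path: str, max_bytes: int = 20_000) -> str:
    """Return the beginning of a text file so the model can read it."""
    return Path(path).read_text(errors="replace")[:max_bytes]

if __name__ == "__main__":
    mcp.run()  # speaks MCP (JSON-RPC) over stdio by default
```

Once a server like this is registered in an MCP client such as Claude Desktop, the model can call the tools directly instead of guessing at the filesystem.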

I tested it on a real directory. It worked. Claude read through the files and summarised the purpose of the whole project in seconds. Something that would usually take me a while to piece together manually.

The whole thing took a few hours. Nothing fancy. But it opened an interesting door.

Here’s a quick demo:

PS: as usual, I didn’t write any code. In this case I was assisted by Claude, which was kind of funny: we were writing and testing the tool in the same thread. At some point I wrote “hey, now you can read files”, and it seemed pleased. ;)

Scraping Challenges and Open Standards

Following up on what I posted recently about Scrape wars, I wrote a longer post for my company site. Reposting it here just for reference.

We’ve talked before about how everything you write should work as a prompt. Your content should be explicitly structured, easy for AI agents to read, interpret, and reuse. Yet, despite clear advantages, in practice we’re often stuck using workarounds and hacks to access valuable information.

Right now, many AI agents still rely on scraping websites. Scraping is messy, unreliable, and frankly a bit of a nightmare to maintain. It creates an adversarial relationship with companies who increasingly employ tools like robots.txt files, CAPTCHAs, or IP restrictions to block automated access. On top of that, major AI providers like OpenAI and Google are introducing built-in search capabilities within their ecosystems. While these are helpful, they ultimately risk creating a new layer of dependence. If content can only be efficiently accessed through these proprietary AI engines, we risk locking ourselves into another digital silo controlled by private platforms.

There is a simpler, proven, and immediately available solution: RSS. Providing your content via RSS feeds allows AI agents direct, structured access without complicated scraping. Our agents, for example, are already using structured XML reports from the Italian Parliament to effectively monitor parliamentary sessions. This is an ideal case of structured openness. Agents such as our Parliamentary Reporter Agent and the automated Assembly Report Agent thrive precisely because these datasets are publicly available, clearly structured, and easily machine-readable.
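
To make the contrast with scraping concrete, here’s a tiny sketch of how an agent can read a standard RSS 2.0 feed with nothing but Python’s standard library (the feed URL below is a placeholder, not one of the real sources we use):

```python
# Consume an RSS 2.0 feed directly: no scraping, no HTML parsing.
import urllib.request
import xml.etree.ElementTree as ET

FEED_URL = "https://example.org/feed.xml"  # placeholder URL

with urllib.request.urlopen(FEED_URL) as resp:
    root = ET.fromstring(resp.read())

# Standard RSS 2.0 layout: <rss><channel><item>...</item></channel></rss>
for item in root.findall("./channel/item"):
    title = item.findtext("title", default="")
    link = item.findtext("link", default="")
    date = item.findtext("pubDate", default="")
    print(f"{date} | {title} | {link}")
```

The structure is predictable, so the agent never has to reverse-engineer a page layout that may change tomorrow.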

However, the reality isn’t always so positive. Other important legislative and governmental sites impose seemingly arbitrary restrictions. We regularly encounter ministries and other government websites that block automated tools or restrict access based on geographic location, even though their content is explicitly intended as public information. These decisions push us back into pointless workarounds or simply cut off access entirely, which is unacceptable when dealing with public information.

When considering concerns around giving AI models access to content, it’s essential to clearly distinguish two different use cases. One case is scraping or downloading massive amounts of data for training large language models (this understandably raises concerns around copyright, control, and proper attribution). But another entirely different and increasingly crucial case is allowing AI agents access to content purely to provide immediate, useful services to users. In these scenarios, the AI is acting similarly to a traditional user, simply reading and delivering relevant, timely information rather than training on vast archives.

Building on RSS’s straightforwardness, we can take this concept further with more advanced open standards, such as MCP (Model Context Protocol). Imagine a self-discovery mechanism similar to RSS feeds, but designed to handle richer, more complex datasets. MCP could offer AI agents direct ways to discover, interpret, and process deeper levels of information effortlessly, without the current challenges of scraping or the risk of vendor lock-in.

Of course, valid concerns exist about data protection and theft at scale (curiously the same concerns appeared back in the early RSS days, and even when the printing press first emerged… yet we survived). But if our primary goal is genuinely to share ideas and foster transparency, deliberately restricting access to information contradicts our intentions. Public information should remain public, open, and machine-readable.

Let’s avoid creating unnecessary barriers or new digital silos. Instead, let’s embrace standards like RSS and MCP, making sure AI agents are our partners, not adversaries, in building a more transparent and connected digital landscape.