Spotlight → MCP

This morning I asked myself if I could make Spotlight on my Mac talk to Claude. Just a small experiment.

I ended up building a minimal MCP server that exposes Spotlight’s index—files, apps, recent items—as JSON-RPC tools. With that in place, Claude could search my folders, read files, and understand what a project is about.
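
For the curious, here’s roughly what a server like this can look like. It’s a minimal Python sketch assuming the official `mcp` SDK’s FastMCP helper and macOS’s `mdfind` command; the tool names and parameters are illustrative, not necessarily the exact ones I ended up with.

```python
# Minimal sketch of an MCP server that exposes Spotlight as tools.
# Assumes the official `mcp` Python SDK (FastMCP) and macOS's `mdfind` CLI.
import subprocess
from pathlib import Path

from mcp.server.fastmcp import FastMCP

mcp = FastMCP("spotlight")

@mcp.tool()
def spotlight_search(query: str, folder: str | None = None, limit: int = 20) -> list[str]:
    """Query the Spotlight index and return matching file paths."""
    cmd = ["mdfind"]
    if folder:
        cmd += ["-onlyin", folder]  # restrict the search to one directory
    cmd.append(query)
    out = subprocess.run(cmd, capture_output=True, text=True).stdout
    return out.splitlines()[:limit]

@mcp.tool()
def read_file(path: str, max_bytes: int = 20_000) -> str:
    """Return the beginning of a text file so the model can read it."""
    return Path(path).read_text(errors="replace")[:max_bytes]

if __name__ == "__main__":
    mcp.run()  # speaks MCP (JSON-RPC) over stdio by default
```

Once a server like this is registered in an MCP client such as Claude Desktop, the model can call the tools directly instead of guessing at the filesystem.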

I tested it on a real directory. It worked. Claude read through the files and summarised the purpose of the whole project in seconds. Something that would usually take me a while to piece together manually.

The whole thing took a few hours. Nothing fancy. But it opened an interesting door.

Here’s a quick demo:

PS: as usual, I didn’t write any code. In this case I was assisted by Claude, which was kind of funny: we were writing and testing the tool in the same thread. At some point I wrote “hey, now you can read files”, and it seemed pleased. ;)

Scraping Challenges and Open Standards

Following up on what I posted recently about Scrape wars, I wrote a longer post for my company site. Reposting it here just for reference.

We’ve talked before about how everything you write should work as a prompt. Your content should be explicitly structured, easy for AI agents to read, interpret, and reuse. Yet, despite clear advantages, in practice we’re often stuck using workarounds and hacks to access valuable information.

Right now, many AI agents still rely on scraping websites. Scraping is messy, unreliable, and frankly a bit of a nightmare to maintain. It creates an adversarial relationship with companies who increasingly employ tools like robots.txt files, CAPTCHAs, or IP restrictions to block automated access. On top of that, major AI providers like OpenAI and Google are introducing built-in search capabilities within their ecosystems. While these are helpful, they ultimately risk creating a new layer of dependence. If content can only be efficiently accessed through these proprietary AI engines, we risk locking ourselves into another digital silo controlled by private platforms.

There is a simpler, proven, and immediately available solution: RSS. Providing your content via RSS feeds allows AI agents direct, structured access without complicated scraping. Our agents, for example, are already using structured XML reports from the Italian Parliament to effectively monitor parliamentary sessions. This is an ideal case of structured openness. Agents such as our Parliamentary Reporter Agent and the automated Assembly Report Agent thrive precisely because these datasets are publicly available, clearly structured, and easily machine-readable.
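
To make the contrast with scraping concrete, here’s a tiny sketch of how an agent can read a standard RSS 2.0 feed with nothing but Python’s standard library (the feed URL below is a placeholder, not one of the real sources we use):

```python
# Consume an RSS 2.0 feed directly: no scraping, no HTML parsing.
import urllib.request
import xml.etree.ElementTree as ET

FEED_URL = "https://example.org/feed.xml"  # placeholder URL

with urllib.request.urlopen(FEED_URL) as resp:
    root = ET.fromstring(resp.read())

# Standard RSS 2.0 layout: <rss><channel><item>...</item></channel></rss>
for item in root.findall("./channel/item"):
    title = item.findtext("title", default="")
    link = item.findtext("link", default="")
    date = item.findtext("pubDate", default="")
    print(f"{date} | {title} | {link}")
```

The structure is predictable, so the agent never has to reverse-engineer a page layout that may change tomorrow.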

However, the reality isn’t always so positive. Other important legislative and governmental sites impose seemingly arbitrary restrictions. We regularly encounter ministries and other government websites that block automated tools or restrict access based on geographic location, even though their content is explicitly intended as public information. These decisions push us back into pointless workarounds or simply cut off access entirely, which is unacceptable when dealing with public information.

When considering concerns around giving AI models access to content, it’s essential to clearly distinguish two different use cases. One case is scraping or downloading massive amounts of data for training large language models (this understandably raises concerns around copyright, control, and proper attribution). But another entirely different and increasingly crucial case is allowing AI agents access to content purely to provide immediate, useful services to users. In these scenarios, the AI is acting similarly to a traditional user, simply reading and delivering relevant, timely information rather than training on vast archives.

Building on RSS’s straightforwardness, we can take this concept further with more advanced open standards, such as MCP (Model Context Protocol). Imagine a self-discovery mechanism similar to RSS feeds, but designed to handle richer, more complex datasets. MCP could offer AI agents direct ways to discover, interpret, and process deeper levels of information effortlessly, without the current challenges of scraping or the risk of vendor lock-in.

Of course, valid concerns exist about data protection and theft at scale (curiously the same concerns appeared back in the early RSS days, and even when the printing press first emerged… yet we survived). But if our primary goal is genuinely to share ideas and foster transparency, deliberately restricting access to information contradicts our intentions. Public information should remain public, open, and machine-readable.

Let’s avoid creating unnecessary barriers or new digital silos. Instead, let’s embrace standards like RSS and MCP, making sure AI agents are our partners, not adversaries, in building a more transparent and connected digital landscape.