The exposure audit: 2 hours to map what's leaking

Step-by-step method to inventory what is already public about you. Free tools, priority order, decisions to make.

Published October 3, 2024 · 15 min read · General

Last reviewed: January 15, 2026

This version was translated with AI assistance and reviewed by a human.

An M&A lawyer asks me, over coffee, what someone could find on him in two hours. I start a timer. Two hours later he is staring at his 2019 home address, three former mobile numbers, the full list of his board mandates since 2012, and a dump containing his Gmail password from that era, in cleartext. He goes pale. He didn’t know. Nobody knows, until they look.

The usual trap

“I checked, I’m clean. Nothing comes up when I Google myself.” I hear this almost every week, and it means nothing. Googling your own name surfaces the polished, indexed, recent surface of your existence: your LinkedIn headline, a conference bio, maybe a press quote. It tells you what a lazy stranger sees in thirty seconds. It tells you nothing about what a motivated one assembles in an afternoon.

The dominant advice (“do an ego search, set up a Google Alert, you’re covered”) fails for a structural reason. The open web that Google indexes is a thin film over a much deeper stack: breach dumps traded in closed channels, broker databases sold by the record, legal registries that publish your home address by statute, archived snapshots of pages you deleted years ago. None of that is on the first page of search results. A surface search returns only the small fraction of your exposure that happens to be both public and indexed, and it returns it in a reassuring shape. That reassurance is the danger. People walk away thinking they looked, when all they did was glance at the part designed to be seen.

An audit is the opposite of a glance. It is structured, exhaustive within a fixed scope, and reproducible. You decide in advance what surfaces you will check, in what order, with what tools, and you write down everything you find — not just the alarming items, the boring ones too, because the boring ones are what an adversary correlates. OSINT done on yourself is not paranoia; it is the only way to see your own attack surface the way someone hunting you sees it. The goal of the next two hours is not to feel better. It is to produce a list of facts and, for each one, a decision.

Why two hours, and why in that order

Two hours is not arbitrary. It is roughly the budget a competent investigator spends on a mid-value target before they decide whether to dig deeper or move on. If you can reproduce what they find in that window, you know your baseline exposure. Spend less and you skip layers; spend more and you hit diminishing returns that only a paid professional engagement justifies.

The order matters more than the tools. Run it in four phases of thirty minutes each: public surfaces, leaks and credentials, registries and legal sources, social and metadata. Each phase feeds the next. The email addresses you confirm in phase one become the queries you run in phase two. The company names you find in phase three sharpen your social searches in phase four. Auditing out of order means re-running searches with incomplete inputs, which is exactly how people miss the old Hotmail address from 2004 that turns out to be the most exposed identifier they own.

Before you start, build your input list. Twenty minutes, no tools, just memory and old password manager exports: every email you have used regularly, every phone number including lapsed SIMs, every username and pseudonym, every name variant (maiden name, middle-name combinations, anglicized spellings), every current and former address, every company you have been a director or beneficial owner of. The completeness of this list determines the quality of everything downstream. An audit run against three identifiers finds three identifiers’ worth of exposure; the fourth one you forgot is the one that hurts.
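One way to keep that input list honest is to hold it in a small structure that normalizes and deduplicates entries as you add them. The sketch below is illustrative only: `AuditInputs`, `normalize_email` and `normalize_phone` are hypothetical names, and the normalization rules are deliberately simple assumptions.

```python
import re
from dataclasses import dataclass, field


def normalize_email(address: str) -> str:
    """Trim and lowercase an email so duplicates collapse to one entry."""
    return address.strip().lower()


def normalize_phone(number: str) -> str:
    """Keep digits plus a leading '+' so spacing variants compare equal."""
    number = number.strip()
    prefix = "+" if number.startswith("+") else ""
    return prefix + re.sub(r"\D", "", number)


@dataclass
class AuditInputs:
    """Hypothetical container for every identifier the audit will query."""
    emails: set[str] = field(default_factory=set)
    phones: set[str] = field(default_factory=set)
    usernames: set[str] = field(default_factory=set)
    name_variants: set[str] = field(default_factory=set)
    addresses: set[str] = field(default_factory=set)
    companies: set[str] = field(default_factory=set)

    def add_email(self, address: str) -> None:
        self.emails.add(normalize_email(address))

    def add_phone(self, number: str) -> None:
        self.phones.add(normalize_phone(number))

    def total(self) -> int:
        """Count distinct identifiers across all categories."""
        groups = (self.emails, self.phones, self.usernames,
                  self.name_variants, self.addresses, self.companies)
        return sum(len(group) for group in groups)
```

Sets are used so that an address typed twice with different casing counts once, and `total()` gives a quick read on how broad the audit will be.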

Phase 1 — Public surfaces (30 min)

Open an isolated browser session first: a private window logged into nothing, or better, a dedicated profile with no saved accounts. You do not want your audit queries personalized by your own logged-in history, and you do not want them attached to your normal account. This matters most in phase four, where logged-in LinkedIn behaves completely differently from logged-out.

Then run structured queries — Google dorks — instead of plain name searches. The operators do the work:

site:linkedin.com "First Last" — confirms exactly what your LinkedIn exposes to the open web
"First Last" filetype:pdf — CVs, conference rosters, board minutes, court filings, slide decks that name you
"First Last" filetype:xlsx OR filetype:csv — spreadsheets that leaked a contact list with your details
"your@email.com" in quotes — pages that explicitly display that address
site:web.archive.org "First Last" — archived pages, including deleted ones
"First Last" -site:linkedin.com -site:facebook.com — strips the obvious profiles so the long tail surfaces

Work every name variant and every email through these. The operators are not a parlour trick — each one targets a specific failure mode of plain search. filetype:pdf finds you in documents nobody indexed deliberately: a board pack uploaded to a public folder, a conference attendee list, a court filing scanned and posted. The quoted-email query finds the pages that printed your address in plaintext, which is how scrapers built the broker record you’ll meet in phase three. Run them patiently; the value is in the third page of results, not the first, because the first page is the surface you already knew about.
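Typing every operator against every variant by hand is where people cut corners. A minimal sketch of generating the full query list from the templates above, assuming you paste the results into the search box yourself; `build_queries` and the template constants are hypothetical names.

```python
from itertools import product

# Templates mirror the operator list above; {name} and {email} are
# placeholders filled in once per identifier.
NAME_TEMPLATES = [
    'site:linkedin.com "{name}"',
    '"{name}" filetype:pdf',
    '"{name}" filetype:xlsx OR filetype:csv',
    'site:web.archive.org "{name}"',
    '"{name}" -site:linkedin.com -site:facebook.com',
]
EMAIL_TEMPLATES = ['"{email}"']


def build_queries(names: list[str], emails: list[str]) -> list[str]:
    """Apply every template to every name variant and every email."""
    queries = [t.format(name=n) for n, t in product(names, NAME_TEMPLATES)]
    queries += [t.format(email=e) for e, t in product(emails, EMAIL_TEMPLATES)]
    return queries


if __name__ == "__main__":
    for query in build_queries(["First Last", "F. Last"], ["your@email.com"]):
        print(query)
```

Two name variants and one email already produce eleven queries, which is a fair picture of how fast the workload grows with each identifier on the input list.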

Open the Wayback Machine directly and search any old personal domain, blog, or former-employer page you can remember. Deleting a page almost never removes its archived copy: the snapshot taken before you deleted it persists indefinitely. I have lost count of the times a client swore a page was gone and the archive served it back in two clicks, home address and all. Finally, run your main profile photo through reverse image search on Yandex, TinEye and Google Images; the three return different results, and Yandex in particular surfaces republished copies and old accounts that the others miss. Expected findings at the end of phase one: a near-complete professional profile, one or more past addresses, your mandate history, and any press or conference indexing.
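For an old domain with many pages, the Wayback Machine's CDX index can list captures in bulk instead of clicking through the calendar. The sketch below only builds the query URL and turns a JSON response into snapshot links; it makes no network call itself, and the function names are hypothetical. It assumes the CDX server's `output=json` format, where the first row carries the field names.

```python
import json
from urllib.parse import urlencode

CDX_ENDPOINT = "https://web.archive.org/cdx/search/cdx"


def cdx_query_url(target: str, limit: int = 50) -> str:
    """Build a CDX query listing captures under a domain or URL prefix."""
    params = {
        "url": target,
        "matchType": "prefix",   # include every page below the target
        "output": "json",
        "fl": "timestamp,original",
        "collapse": "digest",    # skip consecutive identical captures
        "limit": str(limit),
    }
    return f"{CDX_ENDPOINT}?{urlencode(params)}"


def snapshot_links(cdx_json: str) -> list[str]:
    """Convert a CDX JSON response into browsable snapshot URLs."""
    rows = json.loads(cdx_json)
    if not rows:
        return []
    header, *captures = rows
    ts = header.index("timestamp")
    orig = header.index("original")
    return [f"https://web.archive.org/web/{row[ts]}/{row[orig]}"
            for row in captures]
```

Fetch the URL from `cdx_query_url` in your isolated browser session, save the response, and feed it to `snapshot_links` to get a list you can work through row by row.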

Phase 2 — Leaks and credentials (30 min)

Now the part that actually moves the risk needle. Run every historical email from your input list through HIBP (Have I Been Pwned). Do not stop at “yes/no” — record which breaches each address appears in, the year, and the data classes exposed. A 2016 gaming-forum breach that leaked only a username is a footnote. A 2021 breach that leaked your email next to a cleartext password is a live wound, because that password gets tested against your accounts by automated credential-stuffing tools whether the breach is one year old or six.
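If you query HIBP through its API rather than the website (the v3 `breachedaccount` endpoint needs a paid API key), the recording step can be scripted. This sketch works on an already-parsed response and assumes the `Name`, `BreachDate` and `DataClasses` fields of the HIBP breach model; `BreachNote` and `summarize` are hypothetical names. Note that HIBP reports that passwords were exposed, not whether they were in cleartext, so that detail still has to come from the breach description.

```python
from dataclasses import dataclass


@dataclass
class BreachNote:
    """One row of the phase-two record for a single email address."""
    email: str
    breach: str
    year: int
    data_classes: list[str]

    @property
    def exposes_password(self) -> bool:
        # Matches HIBP classes such as "Passwords" or "Password hints".
        return any("password" in c.lower() for c in self.data_classes)


def summarize(email: str, breaches: list[dict]) -> list[BreachNote]:
    """Turn parsed HIBP breach objects into notes, worst first."""
    notes = [
        BreachNote(
            email=email,
            breach=b["Name"],
            year=int(b["BreachDate"][:4]),  # BreachDate is ISO, e.g. 2021-02-10
            data_classes=list(b.get("DataClasses", [])),
        )
        for b in breaches
    ]
    # Password-bearing breaches first, then most recent first.
    return sorted(notes, key=lambda n: (not n.exposes_password, -n.year))
```

Sorting password-bearing breaches to the top puts the live wounds at the head of the list and the username-only footnotes at the bottom.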

HIBP indexes publicly disclosed breaches: the announced, press-covered, researcher-submitted subset. It does not index the private market, closed Telegram channels, or the enriched dossiers that brokers assemble. So HIBP returning zero does not mean clean; it means no documented exposure in the public slice of a much larger universe. To probe further: DeHashed (partial free tier, roughly $20 for a single month that you cancel afterwards) indexes cleartext and hashed passwords, usernames, IPs and phone numbers. Run your emails, numbers and pseudonyms through it, then export the results. Intelligence X searches breach data, paste sites and document leaks. For a genuinely sensitive profile, Constella Intelligence adds non-English and corporate-credential coverage that the free tools lack.

Read the results as a pattern, not a checklist. What you are actually mapping is reuse: the same password, or a predictable variation of it, appearing across several breaches over several years. That pattern is what turns a single old leak into a present-day account-takeover path, because an attacker who holds your 2015 password and sees how you vary it can often guess your 2026 one in a handful of attempts. Note the data classes too: an address exposed in a 2020 breach is a physical-risk fact even if the password is long since rotated, because the address didn’t change. The number of breaches matters less than three things: how recent they are, what was in them, and whether the credentials still pattern-match anything you use. Write the breach name and year next to each finding; you will want the provenance when you decide what to do.
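The reuse pattern can be made visible with a crude grouping: strip the digits and symbols people bolt onto a base word, then see which base recurs across breaches. This is a rough heuristic under that assumption, not a password-analysis tool, and `password_stem` and `reuse_clusters` are hypothetical names. Run anything like this offline, on your own exported data only.

```python
import re
from collections import defaultdict


def password_stem(password: str) -> str:
    """Lowercase and strip leading/trailing digits and symbols."""
    return re.sub(r"^[\W\d_]+|[\W\d_]+$", "", password.lower())


def reuse_clusters(findings: list[tuple[str, int, str]]) -> dict[str, list[tuple[int, str]]]:
    """Group (breach, year, password) findings by stem.

    Returns only stems seen in two or more distinct breaches, each
    mapped to its (year, breach) hits in chronological order.
    """
    by_stem: dict[str, list[tuple[int, str]]] = defaultdict(list)
    for breach, year, password in findings:
        stem = password_stem(password)
        if stem:  # ignore passwords that are all digits or symbols
            by_stem[stem].append((year, breach))
    return {
        stem: sorted(hits)
        for stem, hits in by_stem.items()
        if len({breach for _, breach in hits}) > 1
    }
```

A stem that shows up in 2015 and again in 2019 is the finding that matters: it tells you which of your current passwords deserve rotation first.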

Phase 3 — Registries and legal sources (30 min)

This is the layer almost everyone underestimates, because it is not a leak and not a hack — it is the law publishing your data on purpose. Your professional history is permanently archived in public registries you cannot edit, and they are frequently more revealing than any breach dump.

Run your full name through the corporate registry for your jurisdiction. In France, Pappers and BODACC expose every board mandate, filed accounts, insolvency proceedings and legal notices, often with the registered address you used at appointment. In the UK, Companies House lists every directorship past and present, with historical filings preserving the home address you gave at the time. The INPI (and its equivalents elsewhere) indexes trademarks and patents in your name. For anyone with an international footprint, add OpenCorporates (aggregating 140+ jurisdictions), PACER for US federal court records, and SEC EDGAR if you have ever touched a registered entity. Property and electoral records round it out: in much of the US and UK these are public, and in the UK the open electoral register is sold wholesale to data brokers, which is precisely how your name and home address land in commercial databases without your consent.

A concrete example of why this layer outranks the breach panic: I once audited a founder who had spent money scrubbing himself from people-search sites and felt secure. Phase three found his current home address in a corporate filing from when he registered his first company at his apartment, eleven years earlier. No broker, no breach — the state had published it, indexed it, and would serve it to anyone who typed his name into the registry for free. The scrubbing was real work aimed at the wrong layer.

The point of phase three is not just to find these records but to understand they are structurally undeletable. You can sometimes suppress a derived broker copy; you almost never remove the statutory source. That distinction drives your decisions later: for a registry fact, the realistic options are accept-and-monitor or mutate-going-forward, rarely remove. Mutating means changing the future inputs — a registered-office service instead of your home, a dedicated forwarding address on new filings — so that next year’s published record points somewhere harmless. The old filing stays; you stop adding to the trail.

Phase 4 — Social and metadata (30 min)

Test your LinkedIn profile from the logged-out, isolated session you opened in phase one. What a non-connection sees is very different from your own logged-in view: note exactly which fields, connections and activity are visible to a stranger, because that is the version an adversary reads. Do the same for X/Twitter, where a decade of public posts is searchable and where old replies often reveal location, routine and relationships you have long forgotten posting.

Then go after metadata. Download a few images you have published — profile photos, event pictures, anything you uploaded directly rather than through a platform that strips it — and inspect their EXIF data with ExifTool or an online viewer. GPS coordinates, camera serial numbers and capture timestamps routinely survive in files posted to personal sites, forums and some messaging exports. The major social platforms strip EXIF on upload, which lulls people into thinking it never matters — but your own website, a club gallery, a real-estate listing, or a forum attachment usually keeps it intact. A single geotagged photo of your kitchen, posted once to a hobby forum, can hand over a home address that you scrubbed everywhere else. Close the phase with a Google Images search on your full name to catch any photo you missed, and note every face that is actually you — including the ones a stranger could plausibly tie to you through a caption, a shared event, or a tagged friend.
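ExifTool can dump tags as JSON (for example `exiftool -json -n -GPSLatitude -GPSLongitude -SerialNumber -DateTimeOriginal *.jpg`), which makes it easy to triage a folder of downloaded images. The sketch below only parses that JSON output; the choice of tags to flag is an assumption based on the risks described above, and the function names are hypothetical.

```python
import json

# Tags treated as sensitive here: location, device identity, capture time.
SENSITIVE_TAGS = ("GPSLatitude", "GPSLongitude", "SerialNumber", "DateTimeOriginal")


def flag_metadata(exiftool_json: str) -> dict[str, dict]:
    """Map each file to the sensitive tags that survived in it."""
    flagged = {}
    for record in json.loads(exiftool_json):
        kept = {tag: record[tag] for tag in SENSITIVE_TAGS if tag in record}
        if kept:
            flagged[record["SourceFile"]] = kept
    return flagged


def has_location(tags: dict) -> bool:
    """True when both coordinates are present, i.e. the photo is geotagged."""
    return "GPSLatitude" in tags and "GPSLongitude" in tags
```

Files that come back with `has_location` true go straight into the findings table as candidates for removal or re-upload without metadata.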

What it means in practice

For you, as a person

You want to know what a recruiter, an adversary, or an ex-partner can pull together on you in an hour. Do it yourself first. Three priorities, all doable this week, all under €200 (most of it free):

Run HIBP on every address you’ve ever used — not just your current one. The 2008 webmail account you abandoned is usually your most exposed identifier. Where you find a cleartext password, rotate any account still using a variant of it and switch on MFA.
Open LinkedIn in a private window and read your own profile as a stranger — write down exactly what’s visible logged-out, then tighten your public-visibility settings to match what you’re actually comfortable handing to someone hunting you.
Check the corporate registry if you hold or held any mandate — Pappers/BODACC in France, Companies House in the UK. If your home address is published there from an old filing, file the change of registered address; you can’t delete the history, but you can stop leaking your current home.

For you, CISO / IT director / executive

1. Audit your key people before your adversaries do. Treat the exposure of your executive committee and sensitive-function staff (finance, M&A, legal, anyone who can authorize a wire) as a defensive-intelligence task, not a compliance checkbox. Direct consequence: you find the cleartext executive credential or the published home address before it becomes the entry point for a spear-phishing or CEO-fraud attempt.

2. Use fresh, external eyes. A person cannot audit their own exposure objectively — they unconsciously skip the addresses and aliases they’d rather not think about. Budget 2-4 hours per key person, run by a third party, twice a year. Direct consequence: the audit catches the long-tail identifiers your own people forget, which are exactly the ones an attacker correlates.

3. Feed findings into your threat model, not a report that gets filed. Each critical finding should map to a control: credential rotation, address suppression, monitoring, or an accepted-risk response plan. Direct consequence: exposure becomes a tracked, decaying metric instead of an annual fright.

Formalizing the results

Findings you don’t write down are findings you’ll re-discover from scratch in six months. Build one table, four to six columns, and fill a row for every single thing you found — alarming or not:

Information	Source	Date	Criticality	Decision
your@email.com + password	DeHashed / Breach X	2019	Critical	Rotate, check linked accounts
+33 6 00 00 00 00	Broker, electoral roll	Unknown	Sensitive	Accept, monitor
2019 home address	Archived firm bio (Wayback)	2019	Critical	Remove if possible, else mutate
forum-handle-2014	Forum archive	2014	Benign	Accept

Four criticality levels: critical (enables account takeover, identity theft, or physical risk — address plus routine plus schedule); sensitive (professionally or personally damaging in the wrong context); public (already known and indexed, no harm in its current state); benign (technically exposed, practically irrelevant). For every critical row you need a decision before you close the audit — not “I’ll think about it,” an actual choice from three: mutate the identifier going forward, remove the data where a real mechanism exists, or accept it with a written response plan. An undecided critical finding is worse than an unknown one, because now you know and did nothing.
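The rule that no critical finding may stay undecided is easy to enforce mechanically. A minimal sketch, assuming the table is kept as structured rows rather than free text; the class and function names are hypothetical and the columns mirror the table above.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Criticality(Enum):
    CRITICAL = "critical"
    SENSITIVE = "sensitive"
    PUBLIC = "public"
    BENIGN = "benign"


class Decision(Enum):
    MUTATE = "mutate"
    REMOVE = "remove"
    ACCEPT = "accept"


@dataclass
class Finding:
    """One row of the findings table."""
    information: str
    source: str
    date: str
    criticality: Criticality
    decision: Optional[Decision] = None  # None means not yet decided


def undecided_critical(findings: list[Finding]) -> list[Finding]:
    """Return critical findings that still lack a decision."""
    return [f for f in findings
            if f.criticality is Criticality.CRITICAL and f.decision is None]


def audit_can_close(findings: list[Finding]) -> bool:
    """The audit closes only when every critical row has a decision."""
    return not undecided_critical(findings)
```

Limiting `Decision` to three values is the point of the design: "I'll think about it" cannot be recorded, so it cannot pass for a decision.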

Then schedule the next pass. Six months for a standard profile, three months for an exposed one. Your exposure surface is not static: new breaches land, brokers buy new records, registries publish new filings. An audit dated January 2026 describes January 2026. Put the next date in the calendar before you close the file.
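Computing the next date is trivial, but writing it down at close time is what makes it happen. A small sketch of the six-month and three-month rule using only the standard library; `add_months` and `next_audit` are hypothetical helper names.

```python
import calendar
from datetime import date


def add_months(start: date, months: int) -> date:
    """Add calendar months, clamping the day to the target month's length."""
    total = start.month - 1 + months
    year = start.year + total // 12
    month = total % 12 + 1
    day = min(start.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def next_audit(last_audit: date, exposed: bool = False) -> date:
    """Six months for a standard profile, three for an exposed one."""
    return add_months(last_audit, 3 if exposed else 6)
```

For an audit closed on January 15, 2026, `next_audit` gives July 15, 2026 for a standard profile and April 15, 2026 for an exposed one.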

Mistakes we see all the time

One audit, considered done forever. Exposure compounds continuously. A single pass is a snapshot, not a state. No recurring date means no audit, just a memory.
Checking only the current email. The address you stopped using in 2017 is still in breach databases and still tied to forgotten accounts. Old identifiers are usually more exposed, because hygiene around them lapsed years ago.
Forgetting aliases and pseudonyms. A forum handle you used for years, never knowingly tied to your name, may already be cross-referenced. Run every alias as a first-class query.
Skipping the registries. People obsess over breach dumps and ignore that a corporate registry or property record publishes your home address by statute — often the single most dangerous fact in the whole table.
Collecting without deciding. A list of findings with no decision column is anxiety, not security. The value is in the decision, not the discovery.
Auditing yourself and trusting the result. You will skip your own uncomfortable identifiers without noticing. For anything that matters, get a second pair of eyes.

Actionable checklist

N1 Check all historical email addresses on HIBP
N1 Google yourself with operators (site:, filetype:, -site:, quoted phrases) on every name variant
N1 Test your LinkedIn profile in a private window — write down what's visible logged-out
N2 Check the corporate registry (Pappers/BODACC, Companies House) for any mandate
N2 Run reverse image search on your main profile photo (Yandex, TinEye, Google Images)
N2 Check the Wayback Machine on old domains, blogs and former-employer pages
N2 Inspect EXIF metadata on images you've published directly
N3 Formalize findings in a criticality table and decide for each critical item: mutate, remove, or accept
N3 For an exposed profile, commission a third party for fresh eyes (2h, €100-300) and set the re-audit date

Going further

The tooling in this guide is the free or low-cost tier of what professionals use: OSINT consolidators like Maltego Community Edition map relationships between your findings when the table gets large, and HIBP plus the Wayback Machine cover most of what you need for a self-audit. The official Google advanced-search operators reference is worth ten minutes, since half the value of phase one is knowing the operators exist. If you want the regulatory backdrop on why your data circulates so freely, the CNIL’s annual reports document the enforcement reality in France under EU rules. And once you’ve mapped your exposure, the natural next reads are how this data was already public before any breach, how to compartmentalize your identity so future leaks hurt less, and how the broker industry monetizes the trail.

Sources and further reading

Have I Been Pwned [official]
Maltego Community Edition [official]
Google Advanced Search operators [official]
Wayback Machine [official]
CNIL — Annual report 2023 [official]