I’m looking for an AI-driven tool that can sweep the web and return a clean, verified list of email addresses for every shipping company operating in the United States. The crawler should pull data not only from publicly available web pages but also from subscription-based sources and our own in-house databases once credentials are provided.

The identification logic should lean on company-name keywords such as “freight”, “logistics”, “carrier”, etc., and on the broader industry category, so that the tool reliably filters out businesses that aren’t genuine shippers. Geographic targeting can be inferred from contact pages or postal data, but the primary selectors are those name and industry cues.

I’d like the finished program to:

• Scan websites, online directories, and any connected databases in real time, harvesting email fields.
• De-duplicate entries and run basic deliverability checks (regex, SMTP ping, or similar) before export.
• Output CSV/Excel and offer an API or simple UI so my team can trigger fresh crawls on demand.
• Come with concise documentation: setup, everyday use, and how to plug in additional data sources.

Python would be ideal (Scrapy, BeautifulSoup, Selenium, or comparable libraries), but I’m open to other stacks if you already have proven code. Let me know how you plan to tackle large-scale crawling, which verification workflow you prefer, and what rate-limit or anti-bot countermeasures you would build in.
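To make the de-dup and syntax-check requirement concrete, here is a minimal sketch of that pre-export step. The function name, the regex, and the input format are illustrative assumptions, not a spec: the pattern is a common simplified check, not full RFC 5322 compliance, and it says nothing about actual deliverability (that would be the separate SMTP-level step).

```python
import re

# Simplified email syntax pattern -- an illustrative assumption,
# not a full RFC 5322 validator and not a deliverability check.
EMAIL_RE = re.compile(r"^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}$")

def clean_emails(addresses):
    """De-duplicate (case-insensitively) and drop syntactically
    invalid addresses, preserving first-seen order."""
    seen = set()
    cleaned = []
    for addr in addresses:
        norm = addr.strip().lower()
        if norm in seen or not EMAIL_RE.match(norm):
            continue
        seen.add(norm)
        cleaned.append(norm)
    return cleaned
```

In a pipeline, this would run on the harvested email column just before CSV/Excel export, so duplicates and malformed strings never reach the output file.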