Ethical Scraping vs Buying Lead Lists: Which Actually Works in 2026?
Cost, lead quality, LGPD/GDPR risk, and scalability compared. Why ethical scraping of public data beats purchased lists on every dimension that matters.
Equipe Sirius CRM
Editor
Every B2B sales team eventually faces the same question: build your own prospect lists through research, or buy a ready-made contact database? In 2026, with Brazil's LGPD consolidated and data quality problems endemic in purchased lists, the answer has become clearer — but the nuances still matter.
This guide breaks down both approaches across the dimensions that actually determine ROI: lead quality, cost per qualified meeting, regulatory risk, and scalability.
What "Ethical Scraping" Actually Means
The term "scraping" carries negative connotations — images of bots harvesting private data, violating terms of service, and enabling spam. Ethical scraping is fundamentally different: it means systematically collecting data that is already publicly available and intended for business contact.
Examples of ethical scraping sources for Brazilian B2B prospecting:
- Receita Federal CNPJ database — public by law. Every registered Brazilian company, with CNAE code, address, size, registration status, and often a registered phone number
- Google Maps / Google Business Profile (formerly Google My Business), where businesses list themselves publicly, including contact information they chose to publish
- LinkedIn public profiles — information professionals chose to make public
- Company websites — contact pages, leadership pages, and press releases companies published for the purpose of being contacted
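The LGPD (Brazil's data protection law) waives the consent requirement for data "manifestly made public by the holder" (Art. 7, §4), and allows processing of publicly accessible data as long as it respects the purpose, good faith, and public interest that justified making it available (Art. 7, §3). Combined with a documented legitimate interest and an easy opt-out, this supports using publicly available business contact information for B2B outreach.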
What Purchased Lists Actually Contain
Lead list vendors promise targeted, verified, up-to-date contact data. The reality is frequently different:
Common problems with purchased lists:
- Stale data — most vendors refresh quarterly or annually. Your "fresh" list may contain contacts who left their company 6 months ago
- Origin opacity — you often don't know how the data was collected. If it came from a scraping operation that violated ToS, you inherit the regulatory risk
- Shared lists — the same contacts in your "exclusive" list are likely also in dozens of competitors' lists. Response rates plummet when a prospect has been contacted by 10 vendors from the same list
- Low ICP fit — generic lists optimized for volume, not your specific ideal customer profile
- LGPD exposure — if the list vendor didn't obtain data lawfully, using it exposes your company to ANPD enforcement
Cost Comparison: Real Numbers
| Metric | Purchased List | Ethical Scraping |
|---|---|---|
| Upfront cost | R$500–5,000 per list | R$0–300/month (tools) |
| Data freshness | Quarterly/annual refresh | Real-time (on-demand) |
| Valid email rate | 60–75% | 85–95% |
| ICP fit (your criteria) | 30–50% | 80–95% (you define filters) |
| Regulatory risk (LGPD) | Medium-High (unknown origin) | Low (public data basis) |
| Email deliverability impact | High bounce → domain damage | Lower bounce rate |
| Competition for same contacts | High (resold to many buyers) | Low (your proprietary list) |
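To see how the two quality rows in the table compound, the sketch below multiplies valid email rate by ICP fit to estimate how many contacts out of 1,000 are both reachable and in-profile. It assumes the two rates are independent, which the table does not state, and the function and variable names are illustrative only.

```python
# Rough estimate: contacts that are both reachable (valid email) and in-profile (ICP fit).
# Assumption: the two rates are independent of each other.

def usable_contacts(valid_email_pct: int, icp_fit_pct: int, contacts: int = 1000) -> int:
    """Return the number of usable contacts, rounded down, using whole-number percentages."""
    return contacts * valid_email_pct * icp_fit_pct // 10_000

# (valid email %, ICP fit %) at the low and high ends of the ranges in the table.
SCENARIOS = {
    "purchased_list": ((60, 30), (75, 50)),
    "ethical_scraping": ((85, 80), (95, 95)),
}

def usable_range(name: str) -> tuple:
    """Return (low, high) usable contacts per 1,000 for a named scenario."""
    low, high = SCENARIOS[name]
    return usable_contacts(*low), usable_contacts(*high)

if __name__ == "__main__":
    for name in SCENARIOS:
        low, high = usable_range(name)
        print(f"{name}: {low} to {high} usable contacts per 1,000")
```

Under these assumptions, a purchased list yields 180 to 375 usable contacts per 1,000 and a scraped list yields 680 to 902. This is an estimate of list quality only; it says nothing about reply or meeting rates.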
Ethical Scraping Tools for Brazilian B2B
The practical stack for building high-quality prospect lists through ethical data collection:
- ReceitaWS (free tier available) — query CNPJ in bulk by CNAE code, state, city, company size. Returns registered address, contact info, and company status
- Google Maps scraping (via tools like Outscraper, or manual) — businesses in a geography by category. Useful for territory-based prospecting
- LinkedIn Sales Navigator — decision-maker mapping within companies identified through CNPJ data
- Hunter.io / Apollo.io — find business email patterns for companies you've identified through other sources
- BuiltWith / Wappalyzer — identify companies using specific technologies (useful for software sales targeting)
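Before sending CNPJ numbers to any lookup service, it can help to normalize them and reject obviously malformed ones locally. The sketch below applies the standard modulo-11 check-digit rule to the classic 14-digit numeric CNPJ format only; the function names are illustrative and not part of any tool listed above.

```python
import re

# Weights used by the modulo-11 check-digit rule for 14-digit numeric CNPJs.
FIRST_WEIGHTS = [5, 4, 3, 2, 9, 8, 7, 6, 5, 4, 3, 2]
SECOND_WEIGHTS = [6] + FIRST_WEIGHTS

def normalize_cnpj(raw: str) -> str:
    """Strip punctuation such as dots, slashes, and hyphens, keeping digits only."""
    return re.sub(r"\D", "", raw)

def _check_digit(digits: list, weights: list) -> int:
    """Compute one check digit from the leading digits and their weights."""
    remainder = sum(d * w for d, w in zip(digits, weights)) % 11
    return 0 if remainder < 2 else 11 - remainder

def is_valid_cnpj(raw: str) -> bool:
    """Return True if the value has 14 digits and both check digits match."""
    cnpj = normalize_cnpj(raw)
    # Repeated-digit strings like 00000000000000 pass the arithmetic but are not real.
    if len(cnpj) != 14 or len(set(cnpj)) == 1:
        return False
    digits = [int(c) for c in cnpj]
    first_ok = _check_digit(digits[:12], FIRST_WEIGHTS) == digits[12]
    second_ok = _check_digit(digits[:13], SECOND_WEIGHTS) == digits[13]
    return first_ok and second_ok
```

A check like this only catches typos and malformed input. Whether the company exists and is active still has to come from the registry data itself.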
The Hybrid Approach: Best of Both Worlds
Treating this as "lists vs scraping" is a false dichotomy. The most effective approach combines both strategically:
- Use CNPJ/public data to build your universe of target companies (ICP fit, geography, size, industry)
- Use LinkedIn to identify decision-makers within those companies (title, seniority, tenure)
- Use email finders to verify contact information for high-priority targets
- Purchase lists only for event-based targeting (e.g., companies that just raised funding, recent hires in your buyer's role) where timeliness is critical and you can verify the list's data origin
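The first step above, building a universe of target companies from public registry data, can be sketched as a simple filter. This is a minimal illustration, not a description of any particular tool: the `Company` fields, the size and status labels, and the sample records are all hypothetical stand-ins for whatever your data source returns.

```python
from dataclasses import dataclass
from typing import Iterable

@dataclass(frozen=True)
class Company:
    cnpj: str    # 14 digits, no punctuation
    name: str
    cnae: str    # primary activity code, digits only
    state: str   # two-letter state code, e.g. "SP"
    size: str    # illustrative size label
    status: str  # illustrative registration status

@dataclass(frozen=True)
class IcpFilter:
    cnae_prefixes: tuple  # activity-code prefixes that count as in-profile
    states: frozenset
    sizes: frozenset

    def matches(self, company: Company) -> bool:
        """True when the company is active and meets every ICP criterion."""
        return (
            company.status == "ATIVA"
            and company.state in self.states
            and company.size in self.sizes
            and company.cnae.startswith(self.cnae_prefixes)
        )

def build_target_universe(companies: Iterable[Company], icp: IcpFilter) -> list:
    """Keep in-profile companies, dropping duplicate CNPJs while preserving order."""
    seen = set()
    universe = []
    for company in companies:
        if company.cnpj in seen or not icp.matches(company):
            continue
        seen.add(company.cnpj)
        universe.append(company)
    return universe

if __name__ == "__main__":
    # Fictional records for illustration only.
    records = [
        Company("11222333000181", "Alfa Software", "6201501", "SP", "EPP", "ATIVA"),
        Company("11222333000181", "Alfa Software", "6201501", "SP", "EPP", "ATIVA"),
        Company("99888777000100", "Beta Padaria", "1091102", "SP", "ME", "ATIVA"),
        Company("55444333000199", "Gama Sistemas", "6202300", "RJ", "EPP", "BAIXADA"),
    ]
    icp = IcpFilter(cnae_prefixes=("62",), states=frozenset({"SP", "RJ"}),
                    sizes=frozenset({"ME", "EPP"}))
    print([c.name for c in build_target_universe(records, icp)])
```

Keeping the ICP criteria in one object makes it easy to log which filter produced which list, which also helps with documenting the data source later.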
The companies that win in B2B prospecting in 2026 are those that treat their prospect list as a proprietary asset — built systematically, enriched continuously, and managed in a CRM like Sirius where every touch is logged and every follow-up is tracked.
LGPD Compliance Checklist for B2B Outreach
- Data source is documented (public registry, company website, LinkedIn public profile)
- You have a legitimate interest basis documented (B2B prospecting for relevant products/services)
- Every outreach includes a clear, easy opt-out ("Reply STOP to unsubscribe")
- Opt-outs are processed within 15 days and the contact is removed from all sequences
- Data is not shared with third parties without consent
- You don't contact individuals using personal data (CPF, personal email) — only business contacts
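Parts of this checklist can be enforced in code before a contact enters a sequence. The sketch below is one possible shape, assuming a hypothetical `Contact` record and a small illustrative list of free-mail domains as a heuristic for personal addresses. It is not legal advice and does not replace a review of your actual process.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional

OPT_OUT_WINDOW = timedelta(days=15)  # processing window taken from the checklist above
FREE_MAIL_DOMAINS = {"gmail.com", "hotmail.com", "yahoo.com.br"}  # illustrative, not exhaustive

@dataclass
class Contact:
    email: str
    data_source: str   # e.g. "public registry", "company website"
    legal_basis: str   # e.g. "legitimate interest"
    opted_out_on: Optional[date] = None

def opt_out_deadline(requested_on: date) -> date:
    """Latest date by which an opt-out request should be fully processed."""
    return requested_on + OPT_OUT_WINDOW

def outreach_blockers(contact: Contact) -> list:
    """Return the reasons this contact should not be emailed (empty list means OK)."""
    blockers = []
    if not contact.data_source.strip():
        blockers.append("data source not documented")
    if not contact.legal_basis.strip():
        blockers.append("legal basis not documented")
    if contact.opted_out_on is not None:
        blockers.append("contact opted out")
    domain = contact.email.rsplit("@", 1)[-1].lower()
    if domain in FREE_MAIL_DOMAINS:
        blockers.append("looks like a personal email address")
    return blockers
```

Returning a list of reasons instead of a single boolean makes it easier to audit why a contact was excluded.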
FAQ
Is CNPJ data scraping legal under LGPD?
Yes, when used correctly. The CNPJ is a public registry in which every Brazilian company is legally required to enroll. LGPD's legitimate interest basis (Art. 7, IX, detailed in Art. 10) and its provisions on publicly accessible data (Art. 7, §§3 and 4) both support using this data for B2B outreach. The key requirements: use the data only for a purpose compatible with why it was made public (business contact), honor opt-outs promptly, and don't use personal data of individuals beyond their professional role.
What's the realistic ROI difference between scraping and purchased lists?
Based on typical conversion benchmarks, ethical scraping generates roughly 2–3x more qualified meetings per 1,000 contacts reached than purchased lists, primarily because of higher ICP fit and lower competition for the same contacts. The setup time (typically 2–4 hours to configure tools and filters) usually pays back within the first 20 meetings booked.
How does ethical scraping compare to LinkedIn Sales Navigator?
They're complementary, not competing. Sales Navigator excels at mapping decision-makers within companies (role, seniority, job history, activity signals). Ethical CNPJ/Google Maps scraping excels at building the universe of target companies (industry, size, geography, registration status). The ideal stack: use CNPJ data to identify your ICP companies, then use LinkedIn to find the right person within each company.
Conclusion
The choice between buying lists and ethical scraping isn't just about cost — it's a strategic decision with direct impact on lead quality, domain reputation, regulatory risk, and process scalability.
In 2026, with LGPD consolidated and ethical scraping tools increasingly accessible, the math has shifted decisively toward public-data prospecting. The initial investment in setting up an ethical scraping process pays back quickly through higher-quality leads, fresher data, and lower regulatory exposure.