Dataset preview

Column types: Subdomain is a string (length 1 to 245); Count is an int64 (1 to 68.1M).

Subdomain             Count
www                   19,447,503
mail                  2,238,910
webmail               2,169,487
cpanel                1,900,299
webdisk               1,850,941
cpcalendars           1,577,226
cpcontacts            1,565,113
autodiscover          1,189,922
ww12                  342,100
lawfilter             242,689
smtp                  215,082
pop                   182,546
ftp                   163,689
m                     143,312
api                   137,009
ww25                  128,449
imap                  124,297
app                   122,871
dev                   121,690
test                  114,003
t                     103,670
blog                  103,595
en                    93,770
autoconfig            91,175
shop                  85,483
admin                 84,957
staging               81,843
pay                   81,760
apps                  68,105
cloud                 59,785
accounts              59,094
login                 56,735
demo                  56,446
de                    52,132
support               47,358
ww38                  46,370
aaa                   45,743
whm                   42,034
link                  41,967
seguro                41,550
cdn                   40,947
www.test              37,887
new                   37,862
portal                37,765
es                    33,339
ww16                  32,048
email                 31,903
www.mail              31,522
domains               31,330
sitemap               30,663
sell                  30,363
crm                   30,324
fr                    30,234
www.dev               29,859
media                 29,740
store                 29,157
inst                  29,146
old                   28,239
info                  27,925
docs                  27,358
sitemaps              27,203
go                    26,282
beta                  24,553
cpanelemaildiscovery  23,569
git                   22,562
server                22,475
wiki                  22,209
chat                  22,192
web                   22,064
www.www               21,452
auth                  21,208
status                20,519
mta-sts               20,450
my                    20,306
static                19,928
wap                   19,626
www6                  18,988
forum                 18,903
home                  18,263
help                  18,102
stage                 17,723
www.staging           17,706
dashboard             17,643
gitlab                17,596
remote                17,529
secure                17,327
www.demo              17,325
lp                    17,102
vpn                   16,853
pixel                 16,362
cms                   15,756
www.blog              15,671
panel                 15,636
img                   15,112
portfolio             15,051
nextcloud             14,622
www.app               14,519
www.admin             14,393
www.shop              14,242
www.api               14,221

Dataset Card for Subdomain Statistics from scanner.ducks.party

Dataset Summary

This dataset contains monthly archives of subdomain statistics for websites seen by the scanner.ducks.party web bot. The data is provided in CSV format, with each archive containing two columns: Subdomain and Count.

Languages

Subdomain labels are mostly English words and abbreviations, but labels in other languages also appear.

Dataset Structure

Data Fields

  • Subdomain: The subdomain name (string)
  • Count: The number of occurrences (integer)
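As a minimal sketch of how these two fields might be read, the snippet below parses CSV text into (subdomain, count) pairs using only the Python standard library. It assumes each archive starts with a header row named exactly Subdomain,Count and that counts are stored without thousands separators; neither detail is confirmed by this card. The helper name load_counts is hypothetical.

```python
import csv
import io

def load_counts(csv_file):
    """Read (Subdomain, Count) pairs from an open CSV file object.

    Assumes a header row with the column names 'Subdomain' and 'Count'.
    """
    reader = csv.DictReader(csv_file)
    return [(row["Subdomain"], int(row["Count"])) for row in reader]

# Inline sample built from three rows of the dataset preview. A real archive
# would be opened with open(path, newline="", encoding="utf-8") instead.
sample = io.StringIO(
    "Subdomain,Count\n"
    "www,19447503\n"
    "mail,2238910\n"
    "webmail,2169487\n"
)
rows = load_counts(sample)
print(rows[0])
```

If the archives turn out to have no header row, csv.reader with positional columns would be the simpler choice.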

Data Splits

Monthly archives are provided as separate files.

Additional Information

Source Data

Only websites that responded to the bot are counted, so the data covers a subset of the internet rather than all of it. The bot is located in Russia, and the data is affected by local network conditions and filtering.

Caveats and Recommendations

  • The 'lawfilter' subdomain is used by censors and does not accurately reflect subdomain trends in the global internet.
  • Data collection is subject to local network conditions and filtering in Russia.
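One way to act on the first caveat is to drop the affected label before analysis. The sketch below removes rows by exact subdomain match; the helper name drop_subdomains is hypothetical, and whether related labels (for example nested ones) should also be excluded is not something this card states.

```python
def drop_subdomains(rows, excluded=("lawfilter",)):
    """Return the rows whose subdomain is not in `excluded` (exact match)."""
    blocked = set(excluded)
    return [(name, count) for name, count in rows if name not in blocked]

# Three adjacent rows from the dataset preview.
rows = [("ww12", 342100), ("lawfilter", 242689), ("smtp", 215082)]
filtered = drop_subdomains(rows)
print(filtered)
```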

Monthly Statistics

  • November 2023 - Unique subdomains: 13,588,987 - Total count: 63,819,447
  • December 2023 - Unique subdomains: 21,207,666 - Total count: 141,514,137
  • January 2024 - Unique subdomains: 6,586,148 - Total count: 37,016,016
  • February 2024 - Unique subdomains: 18,846,254 - Total count: 134,140,761
  • March 2024 - Unique subdomains: 20,083,773 - Total count: 199,922,396
  • April 2024 - Unique subdomains: 19,862,758 - Total count: 196,112,502
  • May 2024 - Unique subdomains: 20,805,517 - Total count: 207,351,437
  • June 2024 - Unique subdomains: 17,607,353 - Total count: 140,782,717
  • July 2024 - Unique subdomains: 20,146,823 - Total count: 228,186,769
  • August 2024 - Unique subdomains: 15,124,494 - Total count: 155,858,868
  • September 2024 - Unique subdomains: 19,191,867 - Total count: 170,792,927
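The figures above can plausibly be recomputed from an archive, assuming "Unique subdomains" is the number of distinct Subdomain values in a file and "Total count" is the sum of its Count column. That interpretation is an assumption, not something the card confirms, and the helper name summarize is hypothetical. The last lines derive the average number of occurrences per unique subdomain from the published November 2023 figures.

```python
def summarize(rows):
    """Return (unique_subdomains, total_count) for one archive's rows."""
    unique = len({name for name, _ in rows})
    total = sum(count for _, count in rows)
    return unique, total

# Two rows from the dataset preview, standing in for a full archive.
rows = [("www", 19447503), ("mail", 2238910)]
unique, total = summarize(rows)
print(unique, total)

# Average occurrences per unique subdomain, from the November 2023 line:
# total count divided by unique subdomains.
mean_nov_2023 = 63_819_447 / 13_588_987
print(round(mean_nov_2023, 2))
```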

License

This dataset is dedicated to the public domain under the Creative Commons Zero (CC0) license. This means you can:

  • Use it for any purpose, including commercial projects.
  • Modify it however you like.
  • Distribute it without asking permission.

No attribution is required, but it's always appreciated!

CC0 license: https://creativecommons.org/publicdomain/zero/1.0/deed.en

To learn more about CC0, visit the Creative Commons website: https://creativecommons.org/publicdomain/zero/1.0/

Dataset Curators
