HIBP Downloader documentation
This is a CLI tool that efficiently downloads a local copy of the pwned password hash data from the very awesome HIBP pwned passwords API endpoint. It uses multiprocessing, async processes, local caching, content ETags and HTTP/2 connection pooling to make the download as fast as seems possible in Python.
Features
- Only download a hash-prefix content block when its content has changed.
- Start, stop and re-start the data-collection process without loss of data already collected.
- Ability to query clear text values and return results from the pwned password data set.
- Generate a single text file with pwned password hash values in order, similar to PwnedPasswordsDownloader from the HIBP team.
- Per prefix file metadata in JSON format for easy data reuse.
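The clear-text query feature rests on how the HIBP pwned passwords range API is keyed: a password is hashed with SHA-1, the first 5 hex characters form the prefix, and each prefix block lists the remaining 35-character suffixes as `SUFFIX:COUNT` lines. The sketch below illustrates that lookup only. The function names `split_sha1` and `pwned_count` are hypothetical, the count `42` is made up for the example, and the block format is assumed to match the API response rather than this tool's on-disk storage, which is not described here.

```python
import hashlib


def split_sha1(password: str) -> tuple[str, str]:
    """Return (prefix, suffix) of the upper-case SHA-1 hex digest of a password."""
    digest = hashlib.sha1(password.encode("utf-8")).hexdigest().upper()
    # The range API is keyed on the first 5 hex characters of the digest.
    return digest[:5], digest[5:]


def pwned_count(password: str, range_text: str) -> int:
    """Look up a password in one prefix block of 'SUFFIX:COUNT' lines.

    Assumes range_text is the block for this password's own prefix.
    Returns 0 when the suffix is not present in the block.
    """
    _, suffix = split_sha1(password)
    for line in range_text.splitlines():
        candidate, _, count = line.strip().partition(":")
        if candidate.upper() == suffix:
            return int(count)
    return 0


# Usage with a hand-built block; the count 42 is illustrative, not real data.
prefix, suffix = split_sha1("password")
example_block = f"{'0' * 35}:1\n{suffix}:42\n"
print(prefix, pwned_count("password", example_block))
```

Only the prefix ever needs to identify which block to read, so a lookup touches one small block rather than the whole data set.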
Install
pip install --upgrade hibp-downloader
Usage
Performance
Sample download activity log from a host with 12 cores on a 45Mbit/s DSL connection.
2023-07-31T03:22:45+1000 | INFO | hibp-downloader | prefix=e585f source=[lc:265201 et:0 rc:722148 ro:3 xx:0] runtime_rate=[11.2MBit/s 86req/s ~71005H/s] runtime=2.33hr download=11748.0MB
2023-07-31T03:22:48+1000 | INFO | hibp-downloader | prefix=e5877 source=[lc:265201 et:0 rc:722268 ro:3 xx:0] runtime_rate=[11.2MBit/s 86req/s ~70998H/s] runtime=2.33hr download=11750.0MB
2023-07-31T03:22:50+1000 | INFO | hibp-downloader | prefix=f5837 source=[lc:265201 et:0 rc:722388 ro:3 xx:0] runtime_rate=[11.2MBit/s 86req/s ~70992H/s] runtime=2.33hr download=11751.9MB
- 86 requests per second to api.pwnedpasswords.com
- 265,201 prefix files from (lc) local-cache; 722,388 from (rc) remote-cache; 3 from (ro) remote-origin; 0 failed (xx) download
- estimated ~70k hash values downloaded per second
- 11.5GB (11,751MB) downloaded in 2.3 hours (full dataset is ~3.5 hours)
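The `source=[...]` field in each log line carries the running counters summarised above. A minimal sketch for pulling those counters out of a log line follows. The helper name `parse_source_counts` is hypothetical and not part of hibp-downloader, and reading the sum of the counters as "prefixes handled so far" is an assumption. The `et` counter appears in the log but is not described in the summary, so it is parsed without interpretation.

```python
import re

# Matches the bracketed counter list, e.g. source=[lc:265201 et:0 rc:722388 ro:3 xx:0]
SOURCE_RE = re.compile(r"source=\[([^\]]*)\]")

# A 5-character hex prefix allows 16**5 distinct prefix blocks.
TOTAL_PREFIXES = 16 ** 5


def parse_source_counts(log_line: str) -> dict[str, int]:
    """Return the source counters of one log line as a dict, or {} if absent."""
    match = SOURCE_RE.search(log_line)
    if match is None:
        return {}
    counts = {}
    for item in match.group(1).split():
        key, _, value = item.partition(":")
        counts[key] = int(value)
    return counts


line = (
    "2023-07-31T03:22:50+1000 | INFO | hibp-downloader | prefix=f5837 "
    "source=[lc:265201 et:0 rc:722388 ro:3 xx:0] "
    "runtime_rate=[11.2MBit/s 86req/s ~70992H/s] runtime=2.33hr download=11751.9MB"
)
counts = parse_source_counts(line)
total = sum(counts.values())  # assumption: one counter increment per prefix handled
print(counts)
print(f"{total} of {TOTAL_PREFIXES} prefixes ({total / TOTAL_PREFIXES:.1%})")
```

Tracking the `xx` counter over a run is a simple way to notice failed downloads without reading every line.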
Project
- GitHub - github.com/threatpatrols/hibp-downloader
- PyPI - pypi.org/project/hibp-downloader/
- ReadTheDocs - hibp-downloader.readthedocs.io
Copyright
- Copyright © 2023 Threat Patrols Pty Ltd
- Copyright © 2023 Nicholas de Jong
All rights reserved.
License
- BSD-3-Clause - see LICENSE file for details.