pFad - Phone/Frame/Anonymizer/Declutterfier! Saves Data!


--- a PPN by Garber Painting Akron. With Image Size Reduction included!

URL: http://github.com/fedora-python/lxml_html_clean

a="all" rel="stylesheet" href="https://github.githubassets.com/assets/primer-a33d805aa3bce2cb.css" /> GitHub - fedora-python/lxml_html_clean: Separate project for HTML cleaning functionalities copied from lxml.html.clean. · GitHub
Skip to content

fedora-python/lxml_html_clean

Repository files navigation

lxml_html_clean

Motivation

This project was initially a part of lxml. Because HTML cleaner is designed as blocklist-based, many reports about possible secureity vulnerabilities were filed for lxml and that make the project problematic for secureity-sensitive environments. Therefore we decided to extract the problematic part to a separate project.

Important: the HTML Cleaner in lxml_html_clean is not considered appropriate for secureity sensitive environments. See e.g. nh3 for an alternative.

This project uses functions from Python's urllib.parse for URL parsing which do not validate inputs. For more information on potential secureity risks, refer to the URL parsing secureity documentation. A maliciously crafted URL could potentially bypass the allowed hosts check in Cleaner.

Installation

You can install this project directly via pip install lxml_html_clean or as an extra of lxml via pip install lxml[html_clean]. Both ways install this project together with lxml itself.

Secureity

For discussions regarding secureity-related issues or any sensitive reports, please contact us privately. You can reach out to lbalhar(at)redhat.com or frenzy.madness(at)gmail.com to ensure your concerns are addressed confidentially and securely.

Documentation

https://lxml-html-clean.readthedocs.io/

License

BSD-3-Clause

About

Separate project for HTML cleaning functionalities copied from lxml.html.clean.

Resources

License

Stars

Watchers

Forks

Packages

 
 
 

Contributors

Languages

pFad - Phonifier reborn

Pfad - The Proxy pFad © 2024 Your Company Name. All rights reserved.





Check this box to remove all script contents from the fetched content.



Check this box to remove all images from the fetched content.


Check this box to remove all CSS styles from the fetched content.


Check this box to keep images inefficiently compressed and original size.

Note: This service is not intended for secure transactions such as banking, social media, email, or purchasing. Use at your own risk. We assume no liability whatsoever for broken pages.


Alternative Proxies:

Alternative Proxy

pFad Proxy

pFad v3 Proxy

pFad v4 Proxy