Developer interface¶
These are the interfaces that PyPAC exposes to developers.
Main interface¶
These are the most commonly used components of PyPAC.
-
class
pypac.
PACSession
(pac=None, proxy_auth=None, pac_enabled=True, response_proxy_fail_filter=None, exception_proxy_fail_filter=None, socks_scheme='socks5', **kwargs)[source]¶ A PAC-aware Requests Session that discovers and complies with a PAC file, without any configuration necessary. PAC file discovery is accomplished via the Windows Registry (if applicable), and the Web Proxy Auto-Discovery (WPAD) protocol. Alternatively, a PAC file may be provided in the constructor.
Parameters: - pac (PACFile) – The PAC file to consult for proxy configuration info.
If not provided, then upon the first request,
get_pac()
is called with default arguments in order to find a PAC file. - proxy_auth (requests.auth.HTTPProxyAuth) – Username and password proxy authentication.
- pac_enabled (bool) – Set to
False
to disable all PAC functionality, including PAC auto-discovery. - response_proxy_fail_filter – Callable that takes a
requests.Response
and returns a boolean for whether the response means the proxy used for the request should no longer be used. By default, the response is not inspected. - exception_proxy_fail_filter – Callable that takes an exception and returns
a boolean for whether the exception means the proxy used for the request should no longer be used.
By default,
requests.exceptions.ConnectTimeout
and requests.exceptions.ProxyError
are matched. - socks_scheme (str) – Scheme to use when PAC file returns a SOCKS proxy. socks5 by default.
-
do_proxy_failover
(proxy_url, for_url)[source]¶ Parameters: Returns: The next proxy config to try, or ‘DIRECT’.
Raises: ProxyConfigExhaustedError – If the PAC file provided no usable proxy configuration.
-
get_pac
(**kwargs)[source]¶ Search for, download, and parse PAC file if it hasn’t already been done. This method is called upon the first use of
request()
, but can also be called manually beforehand if desired. Subsequent calls to this method will only return the obtained PAC file, if any.
Returns: The obtained PAC file, if any.
Return type: PACFile|None
Raises: MalformedPacError – If something that claims to be a PAC file was downloaded but could not be parsed.
-
pac_enabled
= None¶ Set to
False
to disable all PAC functionality, including PAC auto-discovery.
-
proxy_auth
¶ Proxy authentication object.
-
request
(method, url, proxies=None, **kwargs)[source]¶ Raises: - ProxyConfigExhaustedError – If the PAC file provided no usable proxy configuration.
- MalformedPacError – If something that claims to be a PAC file was downloaded but could not be parsed.
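The fail-filter parameters accept plain callables, so a custom policy can be quite small. The following is a minimal sketch, assuming that gateway-style status codes (502, 503, 504) should count as a proxy failure on your network; that policy and the names gateway_error_means_proxy_failed and build_session are illustrative choices, not PyPAC defaults. PyPAC and Requests are imported inside build_session, so they are only needed when it is called.

```python
def gateway_error_means_proxy_failed(response):
    """Hypothetical filter: report a proxy failure on gateway-style status codes."""
    # Any object with a `status_code` attribute works, e.g. requests.Response.
    return response.status_code in (502, 503, 504)


def build_session(username, password):
    """Create a PACSession with proxy credentials and the filter above."""
    # Imported lazily so that this module loads even where PyPAC is absent.
    from pypac import PACSession
    from requests.auth import HTTPProxyAuth

    return PACSession(
        proxy_auth=HTTPProxyAuth(username, password),
        response_proxy_fail_filter=gateway_error_means_proxy_failed,
    )
```

The returned session can then be used like any Requests session, for example session.get(url).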
-
pypac.
get_pac
(url=None, js=None, from_os_settings=True, from_dns=True, timeout=2, allowed_content_types=None, session=None, **kwargs)[source]¶ Convenience function for finding and getting a parsed PAC file (if any) that’s ready to use.
Parameters: - url (str) – Download PAC from a URL. If provided, from_os_settings and from_dns are ignored.
- js (str) – Parse the given string as a PAC file. If provided, from_os_settings and from_dns are ignored.
- from_os_settings (bool) – Look for a PAC URL or filesystem path from the OS settings, and use it if present. Has no effect on platforms other than Windows and macOS/OSX.
- from_dns (bool) – Look for a PAC file using the WPAD protocol.
- timeout – Time to wait for host resolution and response for each URL.
- allowed_content_types – If the response has a
Content-Type
header, then consider the response to be a PAC file only if the header is one of these values. If not specified, the allowed types are application/x-ns-proxy-autoconfig
and application/x-javascript-config
. - session (requests.Session) – Used for getting potential PAC files. If not specified, a generic session is used.
Returns: The first valid parsed PAC file according to the criteria, or None if nothing was found.
Return type: PACFile|None
Raises: MalformedPacError – If something that claims to be a PAC file was obtained but could not be parsed.
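As a rough illustration of the url and js parameters, the sketch below either downloads a PAC file from an explicit URL or parses an inline script. The PAC script, the proxy host in it, and the helper name load_pac are made up for this example; get_pac is imported lazily, so PyPAC only has to be installed when load_pac is called.

```python
# A tiny PAC script; the proxy host below is a made-up example value.
PAC_JS = """
function FindProxyForURL(url, host) {
    if (shExpMatch(host, "*.example.com")) {
        return "PROXY proxy.example.local:8080; DIRECT";
    }
    return "DIRECT";
}
"""


def load_pac(pac_url=None):
    """Get a PAC file from `pac_url` if given, otherwise from the inline script."""
    # Imported lazily so that this module loads even where PyPAC is absent.
    from pypac import get_pac

    if pac_url is not None:
        # With an explicit URL, from_os_settings and from_dns are ignored.
        return get_pac(url=pac_url)
    # Parse the inline script instead of searching the OS settings or DNS.
    return get_pac(js=PAC_JS)
```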
-
pypac.
collect_pac_urls
(from_os_settings=True, from_dns=True, **kwargs)[source]¶ Get all the URLs that potentially yield a PAC file.
Parameters: Returns: A list of URLs that should be tried in order.
Return type:
-
pypac.
download_pac
(candidate_urls, timeout=1, allowed_content_types=None, session=None)[source]¶ Try to download a PAC file from one of the given candidate URLs.
Parameters: - candidate_urls (list[str]) – URLs that are expected to return a PAC file. Requests are made in order, one by one.
- timeout – Time to wait for host resolution and response for each URL. When a timeout or DNS failure occurs, the next candidate URL is tried.
- allowed_content_types – If the response has a
Content-Type
header, then consider the response to be a PAC file only if the header is one of these values. If not specified, the allowed types are application/x-ns-proxy-autoconfig
and application/x-javascript-config
. - session (requests.Session) – Used for getting potential PAC files. If not specified, a generic session is used.
Returns: Contents of the PAC file, or None if no URL was successful.
Return type: str|None
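The candidate-by-candidate behaviour described above can be sketched without any network access. This is not PyPAC's implementation: first_pac_body and its fetch argument are hypothetical, and ignoring Content-Type parameters such as a charset is an assumption of this sketch.

```python
DEFAULT_CONTENT_TYPES = (
    "application/x-ns-proxy-autoconfig",
    "application/x-javascript-config",
)


def first_pac_body(candidate_urls, fetch, allowed_content_types=DEFAULT_CONTENT_TYPES):
    """Return the body of the first acceptable response, or None.

    `fetch` is a hypothetical callable: url -> (content_type or None, body).
    It should raise OSError on timeouts and DNS failures.
    """
    for url in candidate_urls:
        try:
            content_type, body = fetch(url)
        except OSError:
            continue  # Unreachable candidate: move on to the next URL.
        if content_type is not None:
            # Assumption: ignore parameters such as "; charset=utf-8".
            media_type = content_type.split(";")[0].strip().lower()
            if media_type not in allowed_content_types:
                continue
        return body
    return None
```

Passing the fetcher in as an argument keeps the ordering and filtering logic easy to exercise with a fake in tests.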
-
pypac.
pac_context_for_url
(url, proxy_auth=None, pac=None)[source]¶ This context manager provides a simple way to add rudimentary PAC functionality to code that cannot be modified to use
PACSession
, but obeys the HTTP_PROXY
and HTTPS_PROXY
environment variables.
Upon entering this context, PAC discovery occurs with default parameters. If a PAC is found, then it’s asked for the proxy to use for the given URL. The proxy environment variables are then set accordingly.
Note that this provides a very simplified PAC experience that’s insufficient for some scenarios.
Parameters: - url – Consult the PAC for the proxy to use for this URL.
- proxy_auth (requests.auth.HTTPProxyAuth) – Username and password proxy authentication.
- pac (PACFile) – The PAC to use to resolve the proxy. If not provided,
get_pac()
is called with default arguments in order to find a PAC file.
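The environment-variable mechanism that this context manager relies on can be illustrated with the standard library alone. The sketch below is not PyPAC's code: proxy_environment is a hypothetical helper that takes an already-resolved proxy URL, and restoring the previous values on exit is an assumption about sensible behaviour.

```python
import os
from contextlib import contextmanager


@contextmanager
def proxy_environment(proxy_url):
    """Temporarily point HTTP_PROXY and HTTPS_PROXY at `proxy_url`."""
    names = ("HTTP_PROXY", "HTTPS_PROXY")
    saved = {name: os.environ.get(name) for name in names}
    try:
        for name in names:
            os.environ[name] = proxy_url
        yield
    finally:
        # Restore whatever was there before, including "not set".
        for name, value in saved.items():
            if value is None:
                os.environ.pop(name, None)
            else:
                os.environ[name] = value
```

Because a single proxy is chosen once for a single URL, code that talks to several hosts inside the block sees the same proxy for all of them, which is one reason the PAC experience here is simplified.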
PAC parsing and execution¶
Functions and classes for parsing and executing PAC files.
-
class
pypac.parser.
PACFile
(pac_js, **kwargs)[source]¶ Represents a PAC file.
JavaScript parsing and execution is handled by the dukpy library.
Load a PAC file from a given string of JavaScript. Errors during parsing and validation may raise a specialized exception.
Parameters: pac_js (str) – JavaScript that defines the FindProxyForURL() function. Raises: MalformedPacError – If the JavaScript could not be parsed, does not define FindProxyForURL(), or is otherwise invalid.
-
pypac.parser.
parse_pac_value
(value, socks_scheme=None)[source]¶ Parse the return value of
FindProxyForURL()
into a list. List elements will either be the string “DIRECT” or a proxy URL. For example, the result of parsing
PROXY example.local:8080; DIRECT
is a list containing strings http://example.local:8080
and DIRECT
.
Parameters: Returns: Parsed output, with invalid elements ignored. Warnings are logged for invalid elements.
Return type:
-
pypac.parser.
proxy_url
(value, socks_scheme=None)[source]¶ Parse a single proxy config value from FindProxyForURL() into a more usable element.
The recognized keywords are
DIRECT
, PROXY
, SOCKS
, SOCKS4
, and SOCKS5
. See https://developer.mozilla.org/en-US/docs/Web/HTTP/Proxy_servers_and_tunneling/Proxy_Auto-Configuration_PAC_file#return_value_format
Parameters: Returns: Parsed value, e.g.:
DIRECT
, http://example.local:8080
, or socks5://example.local:8080
.
Return type: Raises: ValueError – If input value is invalid.
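The keyword-to-URL mapping described for these two functions can be sketched as follows. This is a simplified stand-in rather than the library's code: to_proxy_url and parse_entries are hypothetical names, and falling back to socks5 for the bare SOCKS keyword is an assumption based on the socks5 default documented elsewhere on this page.

```python
SCHEMES = {"PROXY": "http", "SOCKS4": "socks4", "SOCKS5": "socks5"}


def to_proxy_url(value, socks_scheme="socks5"):
    """Convert one PAC entry such as 'PROXY host:port' into a URL or 'DIRECT'."""
    parts = value.split()
    if len(parts) == 1 and parts[0].upper() == "DIRECT":
        return "DIRECT"
    if len(parts) == 2:
        keyword, address = parts[0].upper(), parts[1]
        if keyword == "SOCKS":
            # Assumption: the bare SOCKS keyword uses the configured scheme.
            return "{}://{}".format(socks_scheme, address)
        if keyword in SCHEMES:
            return "{}://{}".format(SCHEMES[keyword], address)
    raise ValueError("Unrecognized proxy config value: {!r}".format(value))


def parse_entries(pac_result, socks_scheme="socks5"):
    """Split a FindProxyForURL() result on ';' and skip invalid entries."""
    parsed = []
    for entry in pac_result.split(";"):
        entry = entry.strip()
        if not entry:
            continue
        try:
            parsed.append(to_proxy_url(entry, socks_scheme))
        except ValueError:
            continue  # The documented behaviour also logs a warning here.
    return parsed
```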
PAC JavaScript functions¶
Python implementations of JavaScript functions needed to execute a PAC file.
These are injected into the JavaScript execution context. They aren’t meant to be called directly from Python, so the function signatures may look unusual.
Most docstrings below are adapted from http://findproxyforurl.com/netscape-documentation/.
-
pypac.parser_functions.
dateRange
(*args)[source]¶ Accepted forms:
dateRange(day)
dateRange(day1, day2)
dateRange(mon)
dateRange(month1, month2)
dateRange(year)
dateRange(year1, year2)
dateRange(day1, month1, day2, month2)
dateRange(month1, year1, month2, year2)
dateRange(day1, month1, year1, day2, month2, year2)
dateRange(day1, month1, year1, day2, month2, year2, gmt)
day
- is the day of month between 1 and 31 (as an integer).
month
- is one of the month strings:
JAN FEB MAR APR MAY JUN JUL AUG SEP OCT NOV DEC
year
- is the full year number, for example 1995 (but not 95). Integer.
gmt
- is the string “GMT”, which makes time comparisons occur in the GMT timezone; if left unspecified, times are taken to be in the local timezone.
Although the forms above don’t show it, the “GMT” parameter can be specified in any of the 9 different call profiles, always as the last parameter.
If only a single value is specified (from each category:
day
, month
, year
), the function returns a true value only on days that match that specification. If both values are specified, the result is true between those times, including bounds.
Return type: bool
-
pypac.parser_functions.
dnsDomainIs
(host, domain)[source]¶ Parameters: Returns: true iff the domain of hostname matches.
Return type:
-
pypac.parser_functions.
dnsDomainLevels
(host)[source]¶ Parameters: host (str) – is the hostname from the URL. Returns: the number (integer) of DNS domain levels (number of dots) in the hostname. Return type: int
-
pypac.parser_functions.
dnsResolve
(host)[source]¶ Resolves the given DNS hostname into an IP address, and returns it in the dot-separated format as a string. Returns an empty string if there is an error.
Parameters: host (str) – hostname to resolve Returns: Resolved IP address, or empty string if resolution failed. Return type: str
-
pypac.parser_functions.
isInNet
(host, pattern, mask)[source]¶ Pattern and mask specification is done the same way as for SOCKS configuration.
Parameters: - host (str) – a DNS hostname, or IP address. If a hostname is passed, it will be resolved into an IP address by this function.
- pattern (str) – an IP address pattern in the dot-separated format
- mask (str) – mask for the IP address pattern informing which parts of the IP address should be matched against. 0 means ignore, 255 means match.
Returns: True iff the IP address of the host matches the specified IP address pattern.
Return type:
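The pattern-and-mask comparison amounts to a bitwise AND on both addresses. Here is a minimal IPv4-only sketch using the standard library; is_in_net is a hypothetical name, and returning False for hosts that cannot be resolved is an assumption of this example.

```python
import ipaddress
import socket


def is_in_net(host, pattern, mask):
    """Return True if the IPv4 address of `host` matches `pattern` under `mask`."""
    try:
        address = ipaddress.IPv4Address(host)
    except ipaddress.AddressValueError:
        try:
            # Not an IP literal, so resolve it as a hostname.
            address = ipaddress.IPv4Address(socket.gethostbyname(host))
        except (OSError, ipaddress.AddressValueError):
            return False  # Assumption: an unresolvable host never matches.
    mask_bits = int(ipaddress.IPv4Address(mask))
    pattern_bits = int(ipaddress.IPv4Address(pattern))
    # Compare only the bits that the mask marks as significant.
    return (int(address) & mask_bits) == (pattern_bits & mask_bits)
```

For example, a mask of 255.255.255.0 compares the first three octets and ignores the last one.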
-
pypac.parser_functions.
isPlainHostName
(host)[source]¶ Parameters: host (str) – the hostname from the URL (excluding port number). Returns: True iff there is no domain name in the hostname (no dots). Return type: bool
-
pypac.parser_functions.
isResolvable
(host)[source]¶ Tries to resolve the hostname.
Parameters: host (str) – is the hostname from the URL. Returns: True if the hostname resolves successfully. Return type: bool
-
pypac.parser_functions.
localHostOrDomainIs
(host, hostdom)[source]¶ Parameters: Returns: true if the hostname matches exactly the specified hostname, or if there is no domain name part in the hostname, but the unqualified hostname matches.
Return type:
-
pypac.parser_functions.
myIpAddress
()[source]¶ Returns: the IP address of the host that the Navigator is running on, as a string in the dot-separated integer format. Return type: str
-
pypac.parser_functions.
shExpMatch
(host, pattern)[source]¶ Case-insensitive host comparison using a shell expression pattern.
Parameters: Return type:
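A case-insensitive shell-expression match can be approximated with the standard fnmatch module. This is only a sketch: sh_exp_match is a hypothetical name, and fnmatch also accepts bracket expressions such as [abc], which may be broader than what a given PAC implementation supports.

```python
import fnmatch


def sh_exp_match(host, pattern):
    """Case-insensitive shell-style match, e.g. against '*.example.com'."""
    # Lower-casing both sides keeps the result the same on every platform.
    return fnmatch.fnmatchcase(host.lower(), pattern.lower())
```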
-
pypac.parser_functions.
timeRange
(*args)[source]¶ Accepted forms:
timeRange(hour)
timeRange(hour1, hour2)
timeRange(hour1, min1, hour2, min2)
timeRange(hour1, min1, sec1, hour2, min2, sec2)
timeRange(hour1, min1, sec1, hour2, min2, sec2, gmt)
hour
- is the hour from 0 to 23. (0 is midnight, 23 is 11 pm.)
min
- minutes from 0 to 59.
sec
- seconds from 0 to 59.
gmt
- either the string “GMT” for GMT timezone, or not specified, for local timezone. Again, even though the above list doesn’t show it, this parameter may be present in each of the different parameter profiles, always as the last parameter.
Returns: True during (or between) the specified time(s). Return type: bool
-
pypac.parser_functions.
weekdayRange
(start_day, end_day=None, gmt=None)[source]¶ Accepted forms:
weekdayRange(wd1)
weekdayRange(wd1, gmt)
weekdayRange(wd1, wd2)
weekdayRange(wd1, wd2, gmt)
If only one parameter is present, the function yields a true value on the weekday that the parameter represents. If the string “GMT” is specified as a second parameter, times are taken to be in GMT, otherwise in local timezone.
If both
wd1
and wd2 are defined, the condition is true if the current weekday falls between those two weekdays. Bounds are inclusive. If the gmt
parameter is specified, times are taken to be in GMT; otherwise the local timezone is used. Weekday arguments are one of
MON TUE WED THU FRI SAT SUN
.
Parameters: Return type:
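The accepted forms can be sketched in a few lines. This is not the library's implementation: weekday_in_range is a hypothetical name, the now argument exists only so the logic can be tested with a fixed date, and treating a range such as FRI to MON as wrapping around the weekend is an assumption.

```python
from datetime import datetime, timezone

WEEKDAYS = ("MON", "TUE", "WED", "THU", "FRI", "SAT", "SUN")


def weekday_in_range(start_day, end_day=None, gmt=None, now=None):
    """Sketch of the weekdayRange() rules; `now` is a test hook, not a PAC argument."""
    if end_day == "GMT":
        # Two-argument form: weekdayRange(wd1, "GMT").
        end_day, gmt = None, "GMT"
    if now is None:
        now = datetime.now(timezone.utc) if gmt == "GMT" else datetime.now()
    today = now.weekday()  # Monday is 0, matching the order of WEEKDAYS.
    start = WEEKDAYS.index(start_day)
    if end_day is None:
        return today == start
    end = WEEKDAYS.index(end_day)
    if start <= end:
        return start <= today <= end
    # Assumption: a range such as FRI..MON wraps around the weekend.
    return today >= start or today <= end
```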
Proxy resolution¶
Tools for working with a given PAC file and its return values.
-
class
pypac.resolver.
ProxyResolver
(pac, proxy_auth=None, socks_scheme='socks5')[source]¶ Handles the lookup of the proxy to use for any given URL, including proxy failover logic.
Parameters: - pac (pypac.parser.PACFile) – Parsed PAC file.
- proxy_auth (requests.auth.HTTPProxyAuth) – Username and password proxy authentication. If provided, then all proxy URLs returned will include these credentials.
- socks_scheme (str) – Scheme to assume for SOCKS proxies. socks5 by default.
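The failover idea (try proxies in the order the PAC file lists them, and stop using any that fail) can be sketched as below. FailoverList and ProxiesExhausted are hypothetical stand-ins written for this example; they are not the ProxyResolver API, and the real class may track failed proxies differently.

```python
class ProxiesExhausted(Exception):
    """Hypothetical stand-in for ProxyConfigExhaustedError."""


class FailoverList:
    """Walks an ordered proxy list, skipping entries marked as failed."""

    def __init__(self, proxies):
        self._proxies = list(proxies)
        self._failed = set()

    def current(self):
        """Return the first proxy that has not been marked as failed."""
        for proxy in self._proxies:
            if proxy not in self._failed:
                return proxy
        raise ProxiesExhausted("no usable proxy configuration left")

    def fail_over(self, failed_proxy):
        """Mark `failed_proxy` as unusable and return the next candidate."""
        self._failed.add(failed_proxy)
        return self.current()
```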
-
pypac.resolver.
add_proxy_auth
(possible_proxy_url, proxy_auth)[source]¶ Add a username and password to a proxy URL, if the input value is a proxy URL.
Parameters: - possible_proxy_url (str) – Proxy URL or
DIRECT
. - proxy_auth (requests.auth.HTTPProxyAuth) – Proxy authentication info.
Returns: Proxy URL with auth info added, or
DIRECT
.
Return type:
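Embedding credentials in a proxy URL can be done with urllib.parse. The sketch below takes the username and password as plain strings rather than an HTTPProxyAuth object; with_credentials is a hypothetical name, and percent-encoding the credentials is a choice made for this example rather than a statement about PyPAC's behaviour.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def with_credentials(possible_proxy_url, username, password):
    """Embed credentials in a proxy URL; return 'DIRECT' unchanged."""
    if possible_proxy_url == "DIRECT":
        return possible_proxy_url
    parts = urlsplit(possible_proxy_url)
    # Percent-encode so that characters like '@' or ':' cannot break the URL.
    userinfo = "{}:{}".format(quote(username, safe=""), quote(password, safe=""))
    netloc = "{}@{}".format(userinfo, parts.netloc)
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))
```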
WPAD functions¶
Tools for the Web Proxy Auto-Discovery Protocol.
-
pypac.wpad.
proxy_urls_from_dns
(local_hostname=None)[source]¶ Generate URLs from which to look for a PAC file, based on a hostname. Fully-qualified hostnames are checked against the Public Suffix List to ensure that generated URLs don’t go outside the scope of the organization. If the fully-qualified hostname doesn’t have a recognized TLD, such as in the case of intranets with ‘.local’ or ‘.internal’, the TLD is assumed to be the part following the rightmost dot.
Parameters: local_hostname (str) – Hostname to use for generating the WPAD URLs. If not provided, the local hostname is used. Returns: PAC URLs to try in order, according to the WPAD protocol. If the hostname isn’t qualified or is otherwise invalid, an empty list is returned. Return type: list[str]
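The shape of the generated URLs can be illustrated with a simplified sketch. It does not consult the Public Suffix List; instead it always applies the fallback described above and treats the part after the rightmost dot as the TLD. The name wpad_urls and the http://wpad.<domain>/wpad.dat URL form follow the usual WPAD convention and are assumptions here, not output copied from PyPAC.

```python
def wpad_urls(hostname):
    """Sketch of WPAD URL generation, treating the last label as the TLD."""
    labels = hostname.strip(".").split(".")
    if len(labels) < 3 or not all(labels):
        # Unqualified hostnames, or hosts directly under the TLD, yield nothing.
        return []
    urls = []
    # Drop the host label, then shorten the domain one label at a time,
    # stopping before the bare TLD.
    for start in range(1, len(labels) - 1):
        domain = ".".join(labels[start:])
        urls.append("http://wpad.{}/wpad.dat".format(domain))
    return urls
```

With a real Public Suffix List, a hostname under a multi-label suffix such as co.uk would stop one level earlier than this sketch does.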
OS stuff¶
Tools for getting the configured PAC file URL out of the OS settings.
-
pypac.os_settings.
ON_DARWIN
= False¶ True if running on macOS/OSX.
-
pypac.os_settings.
ON_WINDOWS
= False¶ True if running on Windows.
-
pypac.os_settings.
autoconfig_url_from_preferences
()[source]¶ Get the PAC
AutoConfigURL
value from the macOS System Preferences. This setting is visible as the “URL” field in System Preferences > Network > Advanced… > Proxies > Automatic Proxy Configuration.
Returns: The value from the System Preferences, or None if the value isn’t configured or available. Note that it may be a local filesystem path instead of a URL.
Return type: str|None
Raises: NotDarwinError – If called on a non-macOS/OSX platform.
-
pypac.os_settings.
autoconfig_url_from_registry
()[source]¶ Get the PAC
AutoConfigURL
value from the Windows Registry. This setting is visible as the “Use automatic configuration script” field in Internet Options > Connections > LAN Settings.
Returns: The value from the registry, or None if the value isn’t configured or available. Note that it may be a local filesystem path instead of a URL.
Return type: str|None
Raises: NotWindowsError – If called on a non-Windows platform.
-
pypac.os_settings.
file_url_to_local_path
(file_url)[source]¶ Parse an
AutoConfigURL
value with file://
scheme into a usable local filesystem path.
Parameters: file_url – Must start with file://
.
Returns: A local filesystem path. It might not exist.
-
class
pypac.os_settings.
NotWindowsError
[source]
-
class
pypac.os_settings.
NotDarwinError
[source]