# webutil

webutil provides automatic website analysis for the savvy web developer. It helps you build faster websites and pinpoint bottlenecks. Features include:

* quick glance summary of key metrics
* HTTP and JavaScript error reporting
* caching analysis with an empty and primed cache
* detection of 27 commonly used front-end libraries, content management systems and third-party embeds (aka What's That Site Running?)
* automatic asset (HTML, JavaScript, CSS, images) optimization facility:
    * minification
    * compression
    * asset optimization benefit analysis
* header verification and analysis
* built in user agent support with screen dimensions
* standard `phantomjs` facilities
    * screenshots
    * HAR file generation

You can use it to:

* quickly break down and get insights into existing websites
* determine performance bottlenecks
* optimize assets automatically
* cron it and get notified when HTTP or JavaScript errors occur on your production websites (see the cron sketch below)
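
As a sketch of the last point, a hypothetical crontab entry (the install path and email address are assumptions) could run the error checks nightly and mail the report:

    # m h dom mon dow  command (hypothetical paths)
    0 3 * * * /opt/webutil/wush -he -je www.example.com | mail -s "webutil report" ops@example.com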

webutil is a [phantomjs](http://phantomjs.org)-based tool.

## INSTALL

1. Install [phantomjs](http://phantomjs.org/download.html)
2. Get the code: `git clone git://github.com/ditesh/webutil.js`
3. Run it: `chmod a+x wush; ./wush www.reddit.com`

## What can it do?

Lots. Let's get relevant stats for Reddit:

    $ chmod a+x wush  # make the shell script executable
    $ ./wush reddit.com
    webutil.js 1.0.1 (c) 2012-2013 Ditesh Gathani 

    Summary
    Requests    33 request(s), 385103 bytes (376 KB), 0 redirect(s)
    Resources   HTML: 2, CSS: 1, JS: 4, images: 26, others: 0
                UTF-8: 4, ISO-8859-1: 0, others: 0, not-specified: 29
                compressed: 27, not-compressed: 6
    Timing      first byte: 291 ms, onDOMContentLoaded: 667 ms, onLoad: 2386 ms, fully loaded: 4390 ms
    Errors      4xx: 0, 5xx: 0, JS: 0

As you can see, it provides a nicely formatted summary of the site, highlighting key areas. Timing information, so often key for web developers, is covered in more detail below.

### Get a page weight breakdown

Get a page weight breakdown by resources:

    $ ./wush -b reddit.com
    webutil 1.0.1 (c) 2012-2013 Ditesh Gathani 

    Summary
    ... snipped for brevity ...

    Breakdown
    HTML      3 files   107243 bytes (105 KB)
    JS        9 files   338664 bytes (331 KB)
    CSS       3 files   79286 bytes (77 KB)
    PNG       7 files   29916 bytes (29 KB)
    GIF       4 files   1245 bytes (1 KB)
    JPG      15 files   31348 bytes (31 KB)

Let's run it again, but only have it consider resources that are within reddit's control (i.e., excluding third-party embeds). The `-dd` parameter achieves this (more on this below):

    $ ./wush -b -dd reddit,media reddit.com
    webutil 1.0.1 (c) 2012-2013 Ditesh Gathani 

    Summary
    ... snipped for brevity ...

    Breakdown
    HTML      2 files   104255 bytes (102 KB)
    CSS       2 files   77670 bytes (76 KB)
    JS        3 files   126757 bytes (124 KB)
    GIF       2 files   1167 bytes (1 KB)
    PNG       6 files   25966 bytes (25 KB)
    JPG      18 files   35354 bytes (35 KB)


### List URLs and content types

Now, let's get a complete list of URLs accessed by the browser when loading the site:

    $ ./wush -lu reddit.com
    webutil.js 1.0.1 (c) 2012-2013 Ditesh Gathani 

    Summary
    ... snipped for brevity ...

    URI's
    HTML  115084  http://www.reddit.com/
    JS    45830   http://ajax.googleapis.com/ajax/libs/jquery/1.7.2/jquery.min.js
    CSS   78617   http://www.redditstatic.com/reddit._E2WnMcei1o.css
    JS    58064   http://www.redditstatic.com/reddit.en.yRpDmsrGWVQ.js
    ... snipped for brevity ...

The content type, resource size and resource URL are provided. Occasionally, we are only interested in resources from the same domain. This can be achieved with the `-d` parameter:

    $ ./wush -lu -d reddit.com
    webutil.js 1.0.1 (c) 2012-2013 Ditesh Gathani 

    Summary
    ... snipped for brevity ...

    URI's
    HTML  110196  http://www.reddit.com/
    HTML  2108    http://www.reddit.com/api/request_promo

Whoops, clearly the URL list is incomplete. As it turns out, reddit.com keeps its assets on other domains. We need to rerun the command using `-dd` to specify the other domains reddit.com uses:

    $ ./wush -lu -dd reddit,media reddit.com
    webutil.js 1.0.1 (c) 2012-2013 Ditesh Gathani 

    Summary
    ... snipped for brevity ...

    URI's
    HTML  114590  http://www.reddit.com/
    CSS   78497   http://www.redditstatic.com/reddit._E2WnMcei1o.css
    JS    68817   http://www.redditstatic.com/reddit.en.yRpDmsrGWVQ.js
    PNG   9670    http://www.redditstatic.com/sprite-reddit.HLCFG7U22Hg.png
    ... snipped for brevity ...

The `-dd` parameter selects URLs by pattern matching against all hostnames (i.e., in this case, each hostname is matched against the words `reddit` and `media`). This allows fine-grained control over which URLs are included in the analysis.
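
For illustration, the selection behaves like a substring test against each hostname. A minimal sketch of the implied logic (illustrative only, not the actual source):

    # Substring matching as implied by "-dd reddit,media"
    for host in www.reddit.com www.redditstatic.com static.adzerk.net; do
        case "$host" in
            *reddit*|*media*) echo "$host: included" ;;
            *)                echo "$host: excluded" ;;
        esac
    done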

The important thing to note about `-d` and `-dd` is that the numbers in the summary reflect only the URLs analyzed.

### List resources by compression status

Listing resources by compression status is easy:

    $ ./wush -lc reddit.com
    webutil 1.0.1 (c) 2012-2013 Ditesh Gathani 

    Summary
    ... snipped for brevity ...

    URI's (Compressed)
        http://www.reddit.com/
        http://ajax.googleapis.com/ajax/libs/jquery/1.7.2/jquery.min.js
        http://www.redditstatic.com/subreddit-stylesheet/l5HnWp45tKh_Hs__SZks99bdVJ8.css
        http://www.redditstatic.com/reddit.en.NiOpBLl9IG8.js
        http://www.redditstatic.com/reddit-init.en.uiM-SGfQunU.js
        http://static.adzerk.net/Extensions/adFeedback.css
        ... snipped for brevity ...

    URI's (Not Compressed)
        http://www.redditstatic.com/welcome-lines.png
        http://www.redditstatic.com/kill.png
        http://www.redditstatic.com/droparrowgray.gif
        ... snipped for brevity ...

This is a good way of spotting whether server-side compression is configured correctly.
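
To double-check a single resource independently of webutil, a plain `curl` request (assuming `curl` is installed) will print a `Content-Encoding` header only if the server compresses the response; note that some servers handle HEAD requests differently from GET:

    $ curl -sI -H "Accept-Encoding: gzip" http://www.redditstatic.com/kill.png | grep -i content-encoding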

### Listing resources by encryption status

Listing resources by encryption status is easy as well:

    $ ./wush -ls https://news.ycombinator.com
    webutil 1.0.1 (c) 2012-2013 Ditesh Gathani 

    Summary
    ... snipped for brevity ...

    URI's (Encrypted)
        https://news.ycombinator.com/
        https://news.ycombinator.com/news.css
        https://news.ycombinator.com/y18.gif
        https://news.ycombinator.com/s.gif
        https://news.ycombinator.com/grayarrow.gif

    URI's (Not Encrypted)
    None

### Listing resources by charset

To list resources by charset:

    $ ./wush -le reddit.com
    webutil 1.0.1 (c) 2012-2013 Ditesh Gathani 

    Summary
    ... snipped for brevity ...

    Charset (utf-8)
        http://www.reddit.com/
        http://ajax.googleapis.com/ajax/libs/jquery/1.7.2/jquery.min.js
        http://ajax.googleapis.com/ajax/libs/jquery/1.7.1/jquery.min.js
        http://www.reddit.com/api/request_promo

    Charset (not specified)
        http://www.redditstatic.com/reddit.Ib1IJ_64tM4.css
        http://www.redditstatic.com/reddit-init.en.uiM-SGfQunU.js
        ... snipped for brevity ...

### Listing redirects

To list redirects:

    $ ./wush -lr reddit.com
    webutil 1.0.1 (c) 2012-2013 Ditesh Gathani 

    Summary
    ... snipped for brevity ...

    Redirects
    302 http://reddit.com/  http://www.reddit.com/


### Primed Cache

There is a built-in facility to produce the summary with a primed cache. This is a good way to check whether the browser is, in fact, caching the site on subsequent visits.

    # One execution
    $ ./wush reddit.com
    webutil.js 1.0.1 (c) 2012-2013 Ditesh Gathani 

    Summary
    Requests    33 request(s), 385103 bytes (376 KB), 0 redirect(s)
    Resources   HTML: 2, CSS: 1, JS: 4, images: 26, others: 0
                UTF-8: 4, ISO-8859-1: 0, others: 0, not-specified: 29
                compressed: 27, not-compressed: 6
    Timing      first byte: 291 ms, onDOMContentLoaded: 667 ms, onLoad: 2386 ms, fully loaded: 4390 ms
    Errors      4xx: 0, 5xx: 0, JS: 0

    # Two executions
    $ ./wush -c 1 reddit.com
    webutil.js 1.0.1 (c) 2012-2013 Ditesh Gathani 

    Summary
    Requests    14 request(s), 135463 bytes (132 KB), 0 redirect(s)
    Resources   HTML: 2, CSS: 0, JS: 1, images: 11, others: 0
                UTF-8: 3, ISO-8859-1: 0, others: 0, not-specified: 11
                compressed: 12, not-compressed: 2
    Timing      first byte: 291 ms, onDOMContentLoaded: 667 ms, onLoad: 2386 ms, fully loaded: 4390 ms
    Errors      4xx: 0, 5xx: 0, JS: 0

When the `-c` parameter is passed, the page is reloaded the specified number of times. In the example above, the page is loaded once, and then reloaded one more time (reflecting the effect of passing `-c 1`).

You can see the benefits of caching as the number of requests, page size, etc. drop. This is best used with `-d` or `-dd` to exclude third-party resources.
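
For example, a primed-cache run restricted to first-party hosts combines the flags documented above:

    $ ./wush -c 1 -dd reddit,media reddit.com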

### Timing

Timing information is important for web developers. `webutil` reports four timing metrics:

* first byte: the latency between the request and the first chunk of the first response
* onDOMContentLoaded: fires when the page's DOM is fully constructed, though the referenced resources may not have finished loading
* onLoad: fires when document loading completes
* fully loaded: fires when there is no more network activity

The four metrics are clearly visible below under the Timing section:

    $ ./wush reddit.com
    webutil.js 1.0.1 (c) 2012-2013 Ditesh Gathani 

    Summary
    Requests    33 request(s), 385103 bytes (376 KB), 0 redirect(s)
    Resources   HTML: 2, CSS: 1, JS: 4, images: 26, others: 0
                UTF-8: 4, ISO-8859-1: 0, others: 0, not-specified: 29
                compressed: 27, not-compressed: 6
    Timing      first byte: 291 ms, onDOMContentLoaded: 667 ms, onLoad: 2386 ms, fully loaded: 4390 ms
    Errors      4xx: 0, 5xx: 0, JS: 0

There is an additional Python-based tool that executes multiple runs and provides mean and standard deviation timing information. It is invoked transparently as follows:

    $ ./wush -repeat 10 reddit.com
    webutil 1.0.1 (c) 2012-2013 Ditesh Gathani 
    Executing 10 runs on reddit.com

    First Byte:   mean 162.6ms, standard deviation: 30.45ms
    DOMContentLoaded: mean 612.8ms, standard deviation: 32.29ms
    On Load:    mean 1767.4ms, standard deviation: 109.02ms
    Fully Loaded:   mean 3771.3ms, standard deviation: 109.0ms

For reference, the Python script is available in `tools/timing.py`.
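
If you prefer to stay in the shell, a rough equivalent (a sketch that parses the human-readable summary; standard `grep` and `awk` assumed) is:

    $ for i in 1 2 3; do ./wush reddit.com; done \
        | grep -o 'onLoad: [0-9]* ms' \
        | awk '{ sum += $2; n++ } END { printf "mean onLoad: %.1f ms\n", sum / n }'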

### List HTTP and JavaScript errors

List HTTP errors (4xx, 5xx) and JavaScript errors as follows:

    # We use www.klia.com.my as the site has some errors
    $ ./wush -he -je www.klia.com.my
    webutil 1.0.1 (c) 2012-2013 Ditesh Gathani 

    Summary
    ... snipped for brevity ...

    JavaScript errors
    "SyntaxError: Parse error"

    HTTP errors
    404 http://www.klia.com.my/images/btt_airports_over.gif
    404 http://www.klia.com.my/images/btt_news_over.gif
    404 http://www.klia.com.my/images/btt_investor_over.gif
    404 http://www.klia.com.my/btt_airlines_over.gif
    404 http://www.klia.com.my/images/btt_careers_over.gif
    404 http://www.klia.com.my/images/btt_social_over.gif
    404 http://www.klia.com.my/images/btt_about_over.gif
    404 http://www.klia.com.my/klia2008eng/global_js/menu_dom.js

### JSON output

Pass `-json` for JSON-only output:

    $ ./wush -json -b reddit.com
    {"HTML":{"count":3,"size":115683},"JS":{"count":9,"size":279306},"CSS":{"count":3,"size":79210},"GIF":{"count":4,"size":1245},"PNG":{"count":7,"size":28463},"JPG":{"count":19,"size":37402}}

### Optimizing

There are a couple of bundled shell scripts that automatically download assets (CSS, JS, images), minify CSS and JS, and optimize all images.

The optimizer script optionally attempts to convert JPGs to PNGs, PNGs to JPGs and GIFs to PNGs to identify which format yields the smallest file size. This is a win when JPGs and GIFs are converted to PNGs, but not necessarily when PNGs are converted to JPGs (as you lose transparency). A manual sketch of this trial follows the transcript below.

    $ chmod a+x tools/downloader tools/optimizer
    $ tools/downloader -dd klia www.klia.com.my

    Web Downloader Tool 1.0.1 (c) 2013 Ditesh Gathani 

    Running webutil ... done.
    Checking for PNG files ... found ... downloading ... done.
    Checking for JPEG files ... found ... downloading ... done.
    Checking for GIF files ... found ... downloading ... done.
    Checking for CSS files ... found ... downloading ... done.
    Checking for JS files ... found ... downloading ... done.

    # Run the optimizer with conversion turned on
    $ tools/optimizer -convert

    Web Optimization Tool 1.0.1 (c) 2013 Ditesh Gathani 

    Checking for PNG files ... optimizing ... optimization completed, 1 file(s) converted ... done.
    Checking for JPEG files ... optimizing ... optimization completed, 1 file(s) converted ... done.
    Checking for GIF files ... optimizing ... optimization completed, 20 file(s) converted ... done.
    Checking for CSS files ... optimizing ... done.
    Checking for JS files ... optimizing ...done.

    RESULTS (with no compression)

    ASSET  COUNT  AS-IS  OPTIMIZED  DIFF(KB)  DIFF(%)
    -----  -----  -----  ---------  --------  -------
    GIF    52     149KB  138KB      11KB      7.38%
    JPG    11     248KB  231KB      17KB      6.85%
    PNG    3      19KB   14KB       5KB       26.31%
    CSS    6      52KB   41KB       11KB      21.15%
    JS     15     221KB  159KB      62KB      28.05%
    TOTAL  87     689KB  583KB      106KB     15.38%

    AWS bandwidth savings: USD$ 19.20 per million visits

    RESULTS (with gzip compression)

    RESOURCE  COUNT  AS-IS  OPTIMIZED  DIFF(KB)  DIFF(%)
    -----     -----  -----  ---------  --------  -------
    GIF       52     145KB  138KB      7KB       4.82%
    JPG       11     227KB  218KB      9KB       3.96%
    PNG       3      19KB   14KB       5KB       26.31%
    CSS       6      13KB   11KB       2KB       15.38%
    JS        15     70KB   53KB       17KB      24.28%
    TOTAL     87     474KB  434KB      40KB      8.43%

    AWS bandwidth savings: USD$ 7.24 per million visits

    # Run the optimizer with conversion turned off
    $ tools/optimizer

    RESULTS (with no compression)

    ASSET  COUNT  AS-IS  OPTIMIZED  DIFF(KB)  DIFF(%)
    -----  -----  -----  ---------  --------  -------
    GIF    52     149KB  149KB      0KB       0%
    JPG    11     248KB  234KB      14KB      5.64%
    PNG    3      19KB   16KB       3KB       15.78%
    CSS    6      52KB   41KB       11KB      21.15%
    JS     15     221KB  159KB      62KB      28.05%
    TOTAL  87     689KB  599KB      90KB      13.06%

    AWS bandwidth savings: USD$ 16.30 per million visits

    RESULTS (with gzip compression)

    RESOURCE  COUNT  AS-IS  OPTIMIZED  DIFF(KB)  DIFF(%)
    -----     -----  -----  ---------  --------  -------
    GIF       52     145KB  145KB      0KB       0%
    JPG       11     227KB  220KB      7KB       3.08%
    PNG       3      19KB   16KB       3KB       15.78%
    CSS       6      13KB   11KB       2KB       15.38%
    JS        15     70KB   53KB       17KB      24.28%
    TOTAL     87     474KB  445KB      29KB      6.11%

    AWS bandwidth savings: USD$ 5.25 per million visits
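
The conversion trial described earlier resembles the following manual check (a sketch using ImageMagick's `convert`, which is an assumption; the bundled scripts may use different tools, and `banner.gif` is a hypothetical non-animated file):

    $ convert banner.gif banner.png
    $ ls -l banner.gif banner.png   # keep whichever file is smaller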

A few things to note here:

* [UglifyJS2](https://github.com/mishoo/UglifyJS2) is used to minify, optimize and compress JavaScript files
* [csstidy](http://csstidy.sourceforge.net/) is used to optimize CSS files
* [optipng](http://optipng.sourceforge.net/) is used to optimize PNG files
* [jpegtran](http://jpegclub.org/jpegtran/) is used to optimize JPEG files
* [optipng](http://optipng.sourceforge.net/) is used to optimize GIF files by converting them to PNG only if the GIF is not animated and the resulting filesize is smaller
* Bandwidth calculations are based on the 1-10TB pricing for the Singaporean AWS region
* Downloaded resources are available in `/tmp/webutil/js/pre` (replace js with css/png/jpg)
* Optimized resources are available in `/tmp/webutil/js/post` (replace js with css/png/jpg); a quick size comparison follows this list
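
Given those paths, a quick before/after size comparison per asset class (a sketch; swap `js` for the other asset directories):

    $ du -sh /tmp/webutil/js/pre /tmp/webutil/js/post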

### Credits

1. PhantomJS
2. Sniffer
3. Steve Souders' Spriter
4. Redbot
5. WebPageTest
