Analysis Guidelines
This section describes some best practices for analysis. These practices come from the experience of analysts in the Data Mining Team. We list a few things you should do (or at least consider doing) and some pitfalls to avoid. We also list issues to keep in mind that could affect the quality of your results. Finally, we reference tools and data sets that might help in your analysis.
Analysis Quality
- Did you spend time thinking about what question you were answering?
- Did you engage potential users of your analysis to ensure you address the right questions?
- How much effort did you put into checking the quality of the data?
- How reproducible is your analysis? If you were to pick up your project 6 months from now could you reuse anything?
- Did you review your write-up to your satisfaction?
- Did you have others review your analysis artifacts (scripts, code, etc.)?
- Is your write-up something you would be proud to publish?
- Do you think readers of your analysis summary can understand the key points easily and benefit from them?
Analysis Do's
- Look at the distribution of your data. Always look at histograms (values and counts) for the key fields in your analysis and see what pops out. In most cases, you will find some surprises that need further investigation before you dive into your real analysis.
- Skewed distributions. Most of the data distributions we see are very skewed ("heavy-tailed" or "long-tailed"). For example, if you are analyzing queries, there may be a handful of queries that dominate (e.g., "google"). The metrics computed for a particular feature or vertical may be heavily skewed because of those few queries.
- Segmentation. Metrics are more useful when segmented appropriately. Not every segment is useful, but some kind of segmentation almost always provides more insight. For example, segment by dominant versus non-dominant query (head vs. tail, "super-head" vs. rest). For more on this, see the section on Segmentation. See also a good blog post on segmentation by the web analytics expert Avinash Kaushik: http://www.kaushik.net/avinash/2010/05/web-analytics-segments-three-category-recommendations.html
- Deep dive. Always look at some unaggregated data as part of your analysis, especially for results that are surprising (either positively or negatively). One good approach is to use Magic Mirror to pull a few sample sessions and see what users are doing in detail. While that will not answer your questions, it may raise questions that had not been considered, or show that some of the assumptions you made are false.
- Make sure the data is correct. Talk to the people who generated the data to verify that every field you are using means what you think it means. Don't trust your intuition; always check. For example, when using the DQ field from one of the databases, it is good to verify which verticals are included in the DQ computation. Not all are included, and the list of those that are included differs between the Competitive and Live Metrics databases.
- Think about baselines. Make sure that the numbers you are comparing are meaningful in their comparison. Often some subset of the population cannot be meaningfully compared to the population as a whole. For example, it isn't terribly meaningful to compare IE entry point Bing users to the global Bing user population in terms of value, because the global Bing user population will be biased by low-value marketing users, have different demographics, etc. It may be that you will simply demonstrate that marketing users are less likely to return than IE and Toolbar users, which is expected, and not what you set out to prove at all.
- Think ahead about possible shortfalls of your methods. Build specific experiments to test whether these shortcomings are real. The beginning of any analysis project should include an active brainstorm of possible reasons the analysis method could be flawed. The project should specifically build in experiments and data sets to attempt to prove or disprove those possible shortcomings. For example, when developing Session Success Rate, we realized that there were concerns that success due to answers would not be properly measured, invalidating the metric for answers-related experiments. To help shed light on this, we tested on data for a known-good answers ranker flight, to ensure that Session Success Rate didn't tell the wrong story in that case.
- Ensure your metric can find both good and bad. Sometimes your tools will have biases, which can be found by testing both good and bad examples. If your metric always says that things are good, it probably isn't useful. This can sometimes be accomplished by having some prior knowledge about good cases and bad cases, and ensuring both are included in your set. For example, imagine that your analysis intends to find the impact of exposure to various Bing features on usage of Bing. In this case, the analysis should include both features like Instant Answers, which we believe are a positive experience for our users, and features like no-results pages, which we believe aren't a good experience for our users. If our analysis then says that both are really good things, or both are really bad things, we know it hasn't produced reliable results.
- Communicate the analysis results. Allocate time and put some effort into communicating the results of your analysis to your customers as well as to anyone who may potentially be interested. Don't wait for them to contact you. Contact them first and ask if they are interested.
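The first few points in the list above (look at the distribution, watch for skew, segment) can be illustrated with a minimal sketch. The toy query log, the function names head_share and segmented_ctr, and the choice of "google" as the only head query are all assumptions made for illustration; they are not part of any tool mentioned in this section.

```python
from collections import Counter


def head_share(queries, top_n=1):
    """Fraction of all impressions taken by the top_n most frequent queries."""
    counts = Counter(queries)
    top = counts.most_common(top_n)
    return sum(count for _, count in top) / len(queries)


def segmented_ctr(rows, head_queries):
    """CTR computed separately for head and tail queries.

    rows: list of (query, clicked) pairs, where clicked is 0 or 1.
    head_queries: set of queries treated as the "head" segment (an assumption).
    """
    totals = {"head": [0, 0], "tail": [0, 0]}  # segment -> [clicks, impressions]
    for query, clicked in rows:
        segment = "head" if query in head_queries else "tail"
        totals[segment][0] += clicked
        totals[segment][1] += 1
    return {seg: (c / n if n else None) for seg, (c, n) in totals.items()}


# Hypothetical toy log: one dominant query plus a short tail.
rows = [("google", 1)] * 8 + [("google", 0)] * 2 + [
    ("cheap flights", 0),
    ("weather seattle", 1),
    ("python csv", 0),
    ("bing maps", 0),
]
queries = [q for q, _ in rows]

print(Counter(queries).most_common(3))    # value/count "histogram" of the key field
print(head_share(queries))                # share of traffic from the single top query
print(sum(c for _, c in rows) / len(rows))  # overall CTR, dominated by the head query
print(segmented_ctr(rows, {"google"}))    # CTR per segment
```

On this toy log the overall CTR is 9/14 (about 0.64), while the head segment has a CTR of 0.8 and the tail only 0.25, so the aggregate number mostly reflects the one dominant query.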
Analysis Don'ts
- Don't go too broad in the analysis. When trying to look at everything it's very easy to drown in data.
- Don't use a page view-level quantity to determine a cohort of users without extreme care. This can introduce unexpected biases due to coverage effects, which can influence broad features of the cohort.
- Don't be afraid to turn away from some analysis method which is proving unproductive. Just because you've written up a plan and scheduled time for a project doesn't mean you should be afraid to fail fast if that's the right thing to do.
Analysis Issues
- Precision: add error bars (e.g. 95% confidence intervals). This is especially important when working with sampled data (sampled NIF streams or Magic Mirror). For example, if we compare two estimates (e.g. CTR) that are different but whose 95% confidence intervals overlap, we can't say that they are different (though we can't say that they're equal either).
- Accuracy: depending on the "ground truth" and data set used for the analysis, there may be a bias that needs to be understood to put the analysis results in perspective. For example, when using a particular flight for the analysis, there is a mechanism for selecting users to be in that flight. The users in the flight may not be a true random sample from the population your analysis is interested in, in which case a bias is introduced into the analysis. There can also be temporal bias, e.g. due to seasonal effects: browsing patterns may be different during the weeks before Christmas than, say, in February. Day-of-the-week effects could also be an issue (it is best to use multiples of 7 days for analysis data, e.g. 35 days). Also, unless there is a very good reason for it, don't aggregate over very long periods of time, as the signal will likely change over a long period. This presents a trade-off: aggregating over a short period gives less data and larger error, whereas aggregating over a long period gives more data and better precision but less sensitivity to temporal effects. In general, a four- or five-week period best balances this trade-off.
- Weighted aggregation: When computing aggregate values, one can choose to assign different weights to different data points. Currently Foray (flight analysis) and LiveMetrics compute aggregate metrics in different ways: LiveMetrics gives each impression equal weight, whereas Foray gives each user equal weight (by first computing aggregates per user and then aggregating these values over all users). As a result, the metric values in LiveMetrics represent heavy users more than light users. The results obtained from these two methods can differ both quantitatively and qualitatively. Depending on the analysis, one or the other (or neither) may be most appropriate.
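The precision and weighted-aggregation points above can be sketched in a few lines. This is only an illustration under stated assumptions: the interval uses the simple normal-approximation (Wald) formula, which is one common choice and not necessarily what any of the tools named here use; the function names and the toy data are hypothetical; and the two aggregation functions only mimic the "equal weight per impression" and "equal weight per user" ideas, not the actual LiveMetrics or Foray implementations.

```python
import math
from collections import defaultdict


def ctr_with_ci(clicks, impressions, z=1.96):
    """CTR with a normal-approximation (Wald) confidence interval.

    z=1.96 corresponds to a 95% interval. Returns (estimate, lower, upper).
    """
    p = clicks / impressions
    half_width = z * math.sqrt(p * (1 - p) / impressions)
    return p, p - half_width, p + half_width


def intervals_overlap(a, b):
    """True if the (lower, upper) parts of two ctr_with_ci results overlap."""
    (_, lo_a, hi_a), (_, lo_b, hi_b) = a, b
    return lo_a <= hi_b and lo_b <= hi_a


def impression_weighted_ctr(rows):
    """Every impression counts equally: total clicks / total impressions."""
    return sum(clicked for _, clicked in rows) / len(rows)


def user_weighted_ctr(rows):
    """Every user counts equally: average of the per-user CTRs."""
    per_user = defaultdict(lambda: [0, 0])  # user -> [clicks, impressions]
    for user, clicked in rows:
        per_user[user][0] += clicked
        per_user[user][1] += 1
    rates = [clicks / n for clicks, n in per_user.values()]
    return sum(rates) / len(rates)


# Two hypothetical CTR estimates from samples of 1000 impressions each.
a = ctr_with_ci(120, 1000)
b = ctr_with_ci(150, 1000)
print(a, b, intervals_overlap(a, b))

# Hypothetical log of (user, clicked): one heavy user and two light users.
rows = [("heavy", 1)] * 6 + [("heavy", 0)] * 2 + [("light1", 0), ("light2", 0)]
print(impression_weighted_ctr(rows))  # dominated by the heavy user
print(user_weighted_ctr(rows))        # each user contributes one rate
```

In the toy log the impression-weighted CTR is 0.6 while the user-weighted CTR is 0.25, which shows how far the two aggregation choices can diverge when usage is skewed. The two sample estimates (0.12 and 0.15) have overlapping 95% intervals under this approximation, so by the rule above they should not be reported as different.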