Advice for applying Machine Learning
https://jmetzen.github.io/2015-01-29/ml_advice.html
This post is based on a tutorial given in a machine learning course at University of Bremen. It summarizes some recommendations on how to get started with machine learning on a new problem. This includes
- ways of visualizing your data
- choosing a machine learning method suitable for the problem at hand
- identifying and dealing with over- and underfitting
- dealing with large (read: not very small) datasets
- pros and cons of different loss functions.
The post is based on "Advice for applying Machine Learning" from Andrew Ng. The purpose of this notebook is to illustrate the ideas in an interactive way. Some
of the recommendations are debatable. Take them as suggestions, not as strict rules.
import time
import numpy as np
np.random.seed(0)
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline
Dataset
We will generate some simple toy data using sklearn's make_classification
function:
from sklearn.datasets import make_classification
X, y = make_classification(1000, n_features=20, n_informative=2,
                           n_redundant=2, n_classes=2, random_state=0)

from pandas import DataFrame
df = DataFrame(np.hstack((X, y[:, None])),
               columns=range(20) + ["class"])
Notice that we generate a dataset for binary classification consisting of 1000 datapoints and 20 feature dimensions. We have used the DataFrame class from pandas to encapsulate the data and the classes into one joint data structure. Let's take a look at the first 5 datapoints:
df[:5]
  | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | ... | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | class
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---
0 | -1.063780 | 0.676409 | 1.069356 | -0.217580 | 0.460215 | -0.399167 | -0.079188 | 1.209385 | -0.785315 | -0.172186 | ... | -0.993119 | 0.306935 | 0.064058 | -1.054233 | -0.527496 | -0.074183 | -0.355628 | 1.057214 | -0.902592 | 0 |
1 | 0.070848 | -1.695281 | 2.449449 | -0.530494 | -0.932962 | 2.865204 | 2.435729 | -1.618500 | 1.300717 | 0.348402 | ... | 0.225324 | 0.605563 | -0.192101 | -0.068027 | 0.971681 | -1.792048 | 0.017083 | -0.375669 | -0.623236 | 1 |
2 | 0.940284 | -0.492146 | 0.677956 | -0.227754 | 1.401753 | 1.231653 | -0.777464 | 0.015616 | 1.331713 | 1.084773 | ... | -0.050120 | 0.948386 | -0.173428 | -0.477672 | 0.760896 | 1.001158 | -0.069464 | 1.359046 | -1.189590 | 1 |
3 | -0.299517 | 0.759890 | 0.182803 | -1.550233 | 0.338218 | 0.363241 | -2.100525 | -0.438068 | -0.166393 | -0.340835 | ... | 1.178724 | 2.831480 | 0.142414 | -0.202819 | 2.405715 | 0.313305 | 0.404356 | -0.287546 | -2.847803 | 1 |
4 | -2.630627 | 0.231034 | 0.042463 | 0.478851 | 1.546742 | 1.637956 | -1.532072 | -0.734445 | 0.465855 | 0.473836 | ... | -1.061194 | -0.888880 | 1.238409 | -0.572829 | -1.275339 | 1.003007 | -0.477128 | 0.098536 | 0.527804 | 0 |
5 rows × 21 columns
It is hard to get any sense of the problem by looking at the raw feature values directly, even in this low-dimensional example. There are, however, many ways of providing more accessible views of your data; a small subset of these is discussed in the next section.
Visualization
The first step when approaching a new problem should nearly always be visualization, i.e., looking at your data.
Seaborn is a great package for statistical data visualization. We use some of its functions to explore the data.
The first step is to generate scatter plots and histograms using the pairplot function. The two colors correspond to the two classes; we use a subset of the features and only the first 50 datapoints to keep things simple.
_ = sns.pairplot(df[:50], vars=[8, 11, 12, 14, 19], hue="class", size=1.5)
Based on the histograms, we can see that some features are more helpful for distinguishing the two classes than others. In particular, features 11 and 14 seem to be informative. The scatter plot of those two features shows that the classes are almost linearly separable in this 2d space. A further thing to note is that features 12 and 19 are highly anti-correlated. We can explore correlations more systematically using corrplot:
plt.figure(figsize=(12, 10))
_ = sns.corrplot(df, annot=False)
We can see our observations from before confirmed here: features 11 and 14 are strongly correlated with the class (they are informative). They are also strongly correlated with each other (via the class). Furthermore, feature 12 is highly anti-correlated with feature 19, which in turn is correlated with feature 14. We thus have some redundant features. This can be problematic for some classifiers, e.g., naive Bayes, which assumes that the features are independent given the class. The remaining features are mostly noise; they are neither correlated with each other nor with the class.
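As a small addition (not part of the original analysis), such pairwise relations can also be checked numerically; the exact values depend on the random seed used above:
# correlation between two individual features, and between a feature and the class label
print "corr(feature 12, feature 19): %.2f" % np.corrcoef(X[:, 12], X[:, 19])[0, 1]
print "corr(feature 11, class):      %.2f" % np.corrcoef(X[:, 11], y)[0, 1]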
Notice that data visualization becomes more challenging if you have more feature dimensions and fewer datapoints. We give an example of visualizing high-dimensional data later.
Choice of the method
Once we have visually explored the data, we can start applying machine learning to it. Given the wealth of methods for machine learning, it is often not easy to decide which one to try first. This simple cheat sheet (credit goes to Andreas Müller and the sklearn team) can help you select an appropriate ML method for your problem (see http://dlib.net/ml_guide.svg for an alternative cheat sheet).
from IPython.display import Image
Image(filename='ml_map.png', width=800, height=600)
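Note that the plot_learning_curve helper used in the rest of this post is not defined in the text. A minimal sketch, assuming sklearn's learning_curve utility (sklearn.learning_curve in releases of that era, sklearn.model_selection today) and plotting only the mean scores, might look like this:
from sklearn.learning_curve import learning_curve  # sklearn.model_selection in newer versions

def plot_learning_curve(estimator, title, X, y, ylim=None, cv=None,
                        train_sizes=np.linspace(.1, 1.0, 5)):
    """Plot mean training and cross-validation scores for increasing training set sizes."""
    train_sizes, train_scores, test_scores = learning_curve(
        estimator, X, y, cv=cv, train_sizes=train_sizes)
    plt.figure()
    plt.title(title)
    if ylim is not None:
        plt.ylim(*ylim)
    plt.xlabel("Training examples")
    plt.ylabel("Score")
    plt.plot(train_sizes, train_scores.mean(axis=1), 'o-', color='r', label="Training score")
    plt.plot(train_sizes, test_scores.mean(axis=1), 'o-', color='g', label="Cross-validation score")
    plt.legend(loc="best")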
Since we have 1000 samples, are predicting a category, and have labels, the sheet recommends that we first use a LinearSVC (which stands for support vector classification with a linear kernel and uses an efficient algorithm for solving this particular problem). So we give it a try. LinearSVC requires choosing the regularization; we use the standard L2-norm penalty and C=10. We plot a learning curve for both the training score and the validation score (score corresponds to accuracy in this case):
from sklearn.svm import LinearSVC
plot_learning_curve(LinearSVC(C=10.0), "LinearSVC(C=10.0)",
X, y, ylim=(0.8, 1.01),
train_sizes=np.linspace(.05, 0.2, 5))
We notice that there is a large gap between the score on the training data and on the validation data. What does that mean? We are probably overfitting the training data!
Addressing overfitting
There are different ways to decrease overfitting:
- increase the number of training examples (getting more data is a common wish of machine learning practitioners)
plot_learning_curve(LinearSVC(C=10.0), "LinearSVC(C=10.0)",
X, y, ylim=(0.8, 1.1),
train_sizes=np.linspace(.1, 1.0, 5))
We see that our validation score becomes larger with more data and the gap closes; thus we are no longer overfitting. There are different ways of obtaining more data: for instance, we (a) might invest the effort of collecting more, (b) create some artificially based on the existing ones (for images, e.g., by rotation, translation, distortion), or (c) add artificial noise.
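As a rough illustration of option (c), not taken from the original post, one could augment the training set with noise-jittered copies of the existing datapoints (the noise level is an arbitrary assumption and would need tuning):
# augment the data with Gaussian-jittered copies of the existing datapoints
noise_scale = 0.1  # assumed noise level
X_augmented = np.vstack((X, X + noise_scale * np.random.randn(*X.shape)))
y_augmented = np.hstack((y, y))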
If none of these approaches is applicable, and more data is thus not available, we could alternatively
- decrease the number of features (we know from our visualizations that features 11 and 14 are
most informative)
plot_learning_curve(LinearSVC(C=10.0), "LinearSVC(C=10.0) Features: 11&14",
X[:, [11, 14]], y, ylim=(0.8, 1.0),
train_sizes=np.linspace(.05, 0.2, 5))
Note that this is a bit of cheating, since we have selected the features manually and based on more data than we gave the classifier. Alternatively, we could use automatic feature selection:
from sklearn.pipeline import Pipeline
from sklearn.feature_selection import SelectKBest, f_classif
# SelectKBest(f_classif, k=2) will select the k=2 best features according to their ANOVA F-value
plot_learning_curve(Pipeline([("fs", SelectKBest(f_classif, k=2)),  # select two features
                              ("svc", LinearSVC(C=10.0))]),
                    "SelectKBest(f_classif, k=2) + LinearSVC(C=10.0)",
                    X, y, ylim=(0.8, 1.0),
                    train_sizes=np.linspace(.05, 0.2, 5))
This worked remarkably well. Feature selection is simple on this toy data. It should be noted that feature selection is only one special kind of reducing the model's complexity. Others would be: (a) reducing the degree of a polynomial model in linear regression, (b) reducing the number of nodes/layers of an artificial neural network, (c) increasing the bandwidth of an RBF kernel, etc.
One question remains: why can't the classifier identify the useful features on its own? Let's first turn to a further alternative for decreasing overfitting:
- increase the regularization of the classifier (decrease parameter C of LinearSVC)
plot_learning_curve(LinearSVC(C=0.1), "LinearSVC(C=0.1)",
X, y, ylim=(0.8, 1.0),
train_sizes=np.linspace(.05, 0.2, 5))
This already helped a bit. We can also select the regularization of the classifier automatically using a grid-search based on cross-validation:
from sklearn.grid_search import GridSearchCV
est = GridSearchCV(LinearSVC(),
param_grid={"C": [0.001, 0.01, 0.1, 1.0, 10.0]})
plot_learning_curve(est, "LinearSVC(C=AUTO)",
X, y, ylim=(0.8, 1.0),
train_sizes=np.linspace(.05, 0.2, 5))
print "Chosen parameter on 100 datapoints: %s" % est.fit(X[:100], y[:100]).best_params_
Chosen parameter on 100 datapoints: {'C': 0.01}
In general, feature selection looked better. Can the classifier identify useful features on its own? Recall that LinearSVC also supports the l1 penalty, which results in sparse solutions. Sparse solutions correspond to an implicit feature selection. Let's try
this:
plot_learning_curve(LinearSVC(C=0.1, penalty='l1', dual=False),
"LinearSVC(C=0.1, penalty='l1')",
X, y, ylim=(0.8, 1.0),
train_sizes=np.linspace(.05, 0.2, 5))
This also looks quite good. Let's investigate the coefficients learned:
est = LinearSVC(C=0.1, penalty='l1', dual=False)
est.fit(X[:150], y[:150]) # fit on 150 datapoints
print "Coefficients learned: %s" % est.coef_
print "Non-zero coefficients: %s" % np.nonzero(est.coef_)[1]
Coefficients learned: [[ 0. 0. 0. 0. 0. 0.01857999
0. 0. 0. 0.004135 0. 1.05241369
0.01971419 0. 0. 0. 0. -0.05665314
0.14106505 0. ]]
Non-zero coefficients: [ 5 9 11 12 17 18]
Most coefficients are zero (the corresponding feature was ignored) and by far the strongest weight is put onto feature 11.
A different dataset
We generate another dataset for binary classification and apply a LinearSVC
again.
from sklearn.datasets import make_circles
X, y = make_circles(n_samples=1000, random_state=2)
plot_learning_curve(LinearSVC(C=0.25), "LinearSVC(C=0.25)",
X, y, ylim=(0.5, 1.0),
train_sizes=np.linspace(.1, 1.0, 5))
Wow, that was very bad: even the training score is not better than random. What is a possible reason for this? Would any of the above recipes (more data, feature selection, increased regularization) help?
It turns out: no. We are in a completely different situation: before, the training score was always close to perfect and we had to address overfitting. This time, the training score is also very low. We are underfitting.
Let us take a look at the data:
df = DataFrame(np.hstack((X, y[:, None])),
columns = range(2) + ["class"])
_ = sns.pairplot(df, vars=[0, 1], hue="class", size=3.5)
This data is clearly not linearly separable; more data or fewer features cannot help. Our model is wrong; hence the underfitting.
Addressing underfitting
Ways to decrease underfitting:
- use more or better features (the distance from the origin should help!)
# add squared distance from origin as third feature
X_extra = np.hstack((X, X[:, [0]]**2 + X[:, [1]]**2))

plot_learning_curve(LinearSVC(C=0.25), "LinearSVC(C=0.25) + distance feature",
                    X_extra, y, ylim=(0.5, 1.0),
                    train_sizes=np.linspace(.1, 1.0, 5))
Perfect! But we had to invest some hard thinking (well, kind of) to come up with this feature. Maybe the classifier could do that kind of thing automatically? This requires us to
- use a more complex model (reduced regularization and/or a non-linear kernel)
from sklearn.svm import SVC
# note: we use the original X without the extra feature
plot_learning_curve(SVC(C=2.5, kernel="rbf", gamma=1.0),
"SVC(C=2.5, kernel='rbf', gamma=1.0)",
X, y, ylim=(0.5, 1.0),
train_sizes=np.linspace(.1, 1.0, 5))
Yes, that also works satisfactorily!
Larger datasets and higher-dimensional feature spaces
Back to the original dataset, but this time with many more features and datapoints and 10 classes. LinearSVC would be a bit slow on a dataset of this size; the cheat sheet recommends using SGDClassifier. This classifier learns a linear model (just as LinearSVC or logistic regression) but uses stochastic gradient descent for training (just as artificial neural networks with backpropagation typically do).
SGDClassifier allows sweeping through the data in mini-batches, which is helpful when the data is too large to fit into memory. Cross-validation is not compatible with this technique; instead, progressive validation is used: here, the estimator is always tested on the next chunk of training data (before seeing it for training). After training on it, it is tested again to check how well it has adapted to the data.
X, y = make_classification(200000, n_features=200, n_informative=25,
n_redundant=0, n_classes=10, class_sep=2,
random_state=0)
from sklearn.linear_model import SGDClassifier
est = SGDClassifier(penalty="l2", alpha=0.001)
progressive_validation_score = []
train_score = []
for datapoint in range(0, 199000, 1000):
    X_batch = X[datapoint:datapoint+1000]
    y_batch = y[datapoint:datapoint+1000]
    if datapoint > 0:
        progressive_validation_score.append(est.score(X_batch, y_batch))
    est.partial_fit(X_batch, y_batch, classes=range(10))
    if datapoint > 0:
        train_score.append(est.score(X_batch, y_batch))

plt.plot(train_score, label="train score")
plt.plot(progressive_validation_score, label="progressive validation score")
plt.xlabel("Mini-batch")
plt.ylabel("Score")
plt.legend(loc='best')
<matplotlib.legend.Legend at 0x7f6a24e2dfd0>
This plot tells us that after 50 mini-batches of data we are no longer improving on the validation data and could thus also stop training. Since the train score is not considerably larger, we are probably underfitting rather than overfitting. It would be nice to test an RBF kernel, but SGDClassifier is unfortunately incompatible with the kernel trick. Alternatives would be to use a multi-layer perceptron, which can also be trained with stochastic gradient descent but is a non-linear model, or to use kernel approximation, as suggested by the cheat sheet.
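A sketch of the kernel-approximation route (not part of the original post; gamma, n_components, and the train/test split are assumed values) could combine sklearn's RBFSampler with the linear SGDClassifier in a pipeline:
from sklearn.kernel_approximation import RBFSampler
from sklearn.pipeline import Pipeline

# approximate an RBF kernel by an explicit random feature map, then train a linear model on top
kernel_approx_clf = Pipeline([("feature_map", RBFSampler(gamma=0.1, n_components=300, random_state=0)),
                              ("sgd", SGDClassifier(penalty="l2", alpha=0.001))])
kernel_approx_clf.fit(X[:20000], y[:20000])
print "Kernel-approximation score: %.3f" % kernel_approx_clf.score(X[20000:40000], y[20000:40000])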
Now for one of the classic datasets used in machine learning, which deals with optical digit recognition:
from sklearn.datasets import load_digits
digits = load_digits(n_class=6)
X = digits.data
y = digits.target
n_samples, n_features = X.shape
print "Dataset consists of %d samples with %d features each" % (n_samples, n_features)

# Plot images of the digits
n_img_per_row = 20
img = np.zeros((10 * n_img_per_row, 10 * n_img_per_row))
for i in range(n_img_per_row):
    ix = 10 * i + 1
    for j in range(n_img_per_row):
        iy = 10 * j + 1
        img[ix:ix + 8, iy:iy + 8] = X[i * n_img_per_row + j].reshape((8, 8))

plt.imshow(img, cmap=plt.cm.binary)
plt.xticks([])
plt.yticks([])
_ = plt.title('A selection from the 8*8=64-dimensional digits dataset')
Dataset consists of 1083 samples with 64 features each
We thus have 1083 examples of hand-written digits (0, 1, 2, 3, 4, 5), where each of those consists of an 8×8 gray-scale image of 4-bit pixels (values 0 to 16). The number of feature dimensions is thus moderate (64); nevertheless, illustrating this 64-dimensional space is non-trivial. We illustrate different methods for reducing dimensionality (to two dimensions), based on http://scikit-learn.org/stable/auto_examples/manifold/plot_lle_digits.html#example-manifold-plot-lle-digits-py.
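The plot_embedding helper used below is, like plot_learning_curve, not defined in the text. A minimal sketch following the linked scikit-learn example (the colormap and figure size are assumptions) could be:
def plot_embedding(X_emb, title=None):
    """Plot a 2d embedding, drawing each datapoint as its digit label, colored by class."""
    x_min, x_max = np.min(X_emb, 0), np.max(X_emb, 0)
    X_emb = (X_emb - x_min) / (x_max - x_min)  # rescale to [0, 1] in both dimensions
    plt.figure(figsize=(10, 10))
    for i in range(X_emb.shape[0]):
        plt.text(X_emb[i, 0], X_emb[i, 1], str(y[i]),
                 color=plt.cm.Set1(y[i] / 10.),
                 fontdict={'weight': 'bold', 'size': 9})
    plt.xticks([]), plt.yticks([])
    if title is not None:
        plt.title(title)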
Even a simple random projection of the data to two dimensions gives a reasonable first impression:
from sklearn import (manifold, decomposition, random_projection)
rp = random_projection.SparseRandomProjection(n_components=2, random_state=42)
stime = time.time()
X_projected = rp.fit_transform(X)
plot_embedding(X_projected, "Random Projection of the digits (time: %.3fs)" % (time.time() - stime))
However, there is a well-known technique that should in general be better suited, namely PCA (implemented using a TruncatedSVD, which does not require constructing the covariance matrix):
stime = time.time()
X_pca = decomposition.TruncatedSVD(n_components=2).fit_transform(X)
plot_embedding(X_pca,
               "Principal Components projection of the digits (time: %.3fs)" % (time.time() - stime))
PCA gives better results and is even faster on this dataset. We could do even better by allowing non-linear transformations from our 64-dimensional input space to the 2-dimensional target space. There exist many methods for this; we present only one of them here: t-SNE.
tsne = manifold.TSNE(n_components=2, init='pca', random_state=0)
stime = time.time()
X_tsne = tsne.fit_transform(X)
plot_embedding(X_tsne,
"t-SNE embedding of the digits (time: %.3fs)" % (time.time() - stime))
Now this is a vastly superior embedding which also shows that it should be possible to separate these classes almost perfectly by a classifier (see, e.g., http://scikit-learn.org/stable/auto_examples/plot_digits_classification.html).
The only disadvantage of t-SNE is that it takes considerably more time to be computed and thus does not scale to large datasets (in the current implementation).
Choice of the loss function
The choice of the loss function is also quite important. Here is an illustration of different loss functions:
# adapted from http://scikit-learn.org/stable/auto_examples/linear_model/plot_sgd_loss_functions.html
xmin, xmax = -4, 4
xx = np.linspace(xmin, xmax, 100)
plt.plot([xmin, 0, 0, xmax], [1, 1, 0, 0], 'k-',
label="Zero-one loss")
plt.plot(xx, np.where(xx < 1, 1 - xx, 0), 'g-',
label="Hinge loss")
plt.plot(xx, np.log2(1 + np.exp(-xx)), 'r-',
label="Log loss")
plt.plot(xx, np.exp(-xx), 'c-',
label="Exponential loss")
plt.plot(xx, -np.minimum(xx, 0), 'm-',
label="Perceptron loss")
# the balanced relative margin machine
#R = 2
#plt.plot(xx, np.where(xx < 1, 1 - xx, (np.where(xx > R, xx-R,0))), 'b-',
# label="L1 Balanced Relative Margin Loss")
plt.ylim((0, 8))
plt.legend(loc="upper right")
plt.xlabel(r"Decision function $f(x)$")
plt.ylabel("$L(y, f(x))$")
<matplotlib.text.Text at 0x7f6a2879cf90>
The different loss functions have different advantages:
- the zero-one loss is what you actually want in classification. Unfortunately, it is non-convex and thus not practical, since the optimization problem becomes more or less intractable.
- the hinge loss (used in support-vector classification) results in solutions which are sparse in the data (due to it being zero for f(x)>1) and is relatively robust to outliers (it grows only linearly for f(x)→−∞). It doesn't provide well-calibrated probabilities.
- the log loss (used, e.g., in logistic regression) results in well-calibrated probabilities. It is thus the loss of choice if you don't want only binary predictions but also probabilities for the outcomes. On the downside, its solutions are not sparse in the data space and it is more influenced by outliers than the hinge loss.
- the exponential loss (used in AdaBoost) is very susceptible to outliers (due to its rapid increase when f(x)→−∞). It is primarily used in AdaBoost since it results there in a simple and efficient boosting algorithm.
- the perceptron loss is basically a shifted version of the hinge loss. The hinge loss also penalizes points which are on the correct side of the boundary but very close to it (maximum-margin principle). The perceptron loss, on the other hand, is happy as long as a datapoint is on the correct side of the boundary, which leaves the boundary under-determined if the data is truly linearly separable and results in worse generalization than a maximum-margin boundary.
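As a side note not in the original post, several of these losses can be tried directly by switching the loss parameter of sklearn's SGDClassifier (the log loss is selected with loss="log" in the version used here; recent releases call it "log_loss"). A small sketch on the digits data loaded above:
# compare surrogate losses on the digits data; only the log loss yields calibrated probabilities
for loss in ["hinge", "log", "perceptron"]:
    clf = SGDClassifier(loss=loss, alpha=0.001, random_state=0)
    clf.fit(X[:800], y[:800])
    print "loss=%-10s accuracy on held-out digits: %.3f" % (loss, clf.score(X[800:], y[800:]))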
Summary
We have discussed some recommendations for how to get machine learning working on a new problem. We have looked at classification problems, but regression and clustering can be addressed similarly. However, the focus on artificial datasets was (while being easy to understand) also somewhat oversimplifying. On many actual problems, the collection, organization, and preprocessing of the data are of utmost importance. See for instance this article on data wrangling. pandas is a great tool for this.
Many application domains also come with specific requirements and tools which are tailored to these demands, e.g.:
- image-processing with skimage
- biosignal analysis and general time-series processing with pySPACE
- financial data with pandas
We don't explore these areas in detail; however, the effort that needs to be invested into a good pre-processing pipeline often considerably exceeds the effort required for selecting an appropriate classifier. A first impression of a moderately complex signal processing pipeline can be obtained from a pySPACE example for detecting a specific event-related potential in EEG data: https://github.com/pyspace/pyspace/blob/master/docs/examples/specs/node_chains/ref_P300_flow.yaml
This signal processing pipeline contains nodes for data standardization, decimation, band-pass filtering, dimensionality reduction (xDAWN is a supervised method for this), feature extraction (Local_Straightline_Features), and feature normalization. The following graphic gives an overview of different nodes in pySPACE that can be applied in a pipeline prior to classification:
Image(filename='algorithm_types_detailed.png', width=800, height=600)
One of the long-term goals of machine learning, pursued among others in the field of deep learning, is to learn large parts of such pipelines from data rather than hand-engineering them.
%load_ext watermark
%watermark -a "Jan Hendrik Metzen" -d -v -m -p numpy,scikit-learn
Jan Hendrik Metzen 29/01/2015
CPython 2.7.9
IPython 2.1.0
numpy 1.9.1
scikit-learn 0.14.1
compiler : GCC 4.4.7 20120313 (Red Hat 4.4.7-1)
system : Linux
release : 3.16.0-28-generic
machine : x86_64
processor : x86_64
CPU cores : 4
interpreter: 64bit
This post was written as an IPython notebook. You can download this notebook.
Posted by Jan Hendrik Metzen 2015-01-29 · python classification machine-learning tutorial