Just ideas, who doesn't have them!
20160214
survey of current RDF triple storage systems
survey of inference mechanisms in the semantic web stack
embrace the semantic web in big data processing:
graph computing?
graph database search transformation?
reasoning mechanism modified?
20160215
play Apache Jena with JDBC
play Apache Jena with Text Search
play Apache Jena with Apache Hadoop, finally
20160216
Top Machine Learning, Data Mining, & NLP Books that Every Data Scientist Should Read: books on machine learning, data mining, and NLP
20160301
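Apache Jena's jena-elephas-* modules seem to provide only a way to compute statistics over the nodes, relations, and so on in RDF data.
How would one run something like SPARQL queries or RDF inference?
Implementing RDF query and inference algorithms requires support from graph-based storage and processing algorithms.
Switch directly to Neo4j, or to an existing graph algorithm framework in the Hadoop ecosystem?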
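The note above asks how SPARQL-like queries and RDF inference could run on top of graph storage. As a rough illustration only, here is a minimal in-memory sketch: triple pattern matching (the core of a SPARQL basic graph pattern) plus forward chaining of two RDFS-style rules. The function names `match` and `infer_types` and the `ex:` sample data are hypothetical, and nothing here uses Jena, Neo4j, or Hadoop APIs.

```python
# A toy triple store: hypothetical names and data, for illustration only.

def match(triples, s=None, p=None, o=None):
    """Yield triples matching a pattern; None acts as a wildcard."""
    for t in triples:
        if ((s is None or t[0] == s)
                and (p is None or t[1] == p)
                and (o is None or t[2] == o)):
            yield t


def infer_types(triples):
    """Forward-chain two RDFS-style rules until nothing new is derived:
    subClassOf is transitive, and rdf:type propagates to superclasses."""
    closure = set(triples)
    changed = True
    while changed:
        changed = False
        new = set()
        for (a, _, b) in match(closure, p="rdfs:subClassOf"):
            # a subClassOf b, b subClassOf c  =>  a subClassOf c
            for (_, _, c) in match(closure, s=b, p="rdfs:subClassOf"):
                new.add((a, "rdfs:subClassOf", c))
            # x type a, a subClassOf b  =>  x type b
            for (x, _, _) in match(closure, p="rdf:type", o=a):
                new.add((x, "rdf:type", b))
        if not new <= closure:
            closure |= new
            changed = True
    return closure


triples = {
    ("ex:Dog", "rdfs:subClassOf", "ex:Mammal"),
    ("ex:Mammal", "rdfs:subClassOf", "ex:Animal"),
    ("ex:rex", "rdf:type", "ex:Dog"),
}
closure = infer_types(triples)
rex_types = sorted(o for (_, _, o) in match(closure, s="ex:rex", p="rdf:type"))
```

Every pattern match here is a full scan, which is exactly the cost that indexed graph storage or a distributed graph framework would be expected to remove.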
20160305
Notes of 大数据智能 ("Big Data Intelligence") records some graph processing algorithms.
Let's rock and roll.
20160306
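Google Scholar search: more resources than you would imagine.
Chrome has an extension called "Google Scholar Button".
The Maven repository search pages now include SBT dependency snippets; the Scala ecosystem is becoming a force that cannot be ignored.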
20160307
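- University of Cambridge NLIP group, S. Teufel, "Lexical Semantics", 8 lectures
1 Ambiguity, word senses, entailment
2 WordNet, and supervised/unsupervised/semi-supervised algorithms for word sense disambiguation (WSD)
3 Lexical context and part-whole relations; WordNet/graph-based disambiguation methods
4 Context, semantic spaces, and semantic similarity
5 Verb semantics, FrameNet
6 Figurative language (metaphor, irony, humor)
7 Semantic orientation of adjectives, antonym pairs, sentiment detection
8 Applications based on lexical semantics
- An incomplete collection of introductory machine learning resources: 2014-10-14 edition, compiled and edited by 好东西传送门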
20160310
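1 A big data education industry? "Rush to join the Big Data IMF Legend Action for 5980 yuan!"
2 The evolution of cluster scheduling architectures: The evolution of cluster scheduler architectures.
3 A tutorial and related resources on Bloom filters: Bloom Filters by Example
4 BSP (Bulk Synchronous Parallel) wiki: Bulk synchronous parallel
5 Spark GraphFrame
GraphFrame, a graph database built on Spark DataFrame: querying graphs with Spark SQL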
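For the Bloom filter item in the list above, a minimal sketch of the data structure may help. It is not taken from the linked tutorial: the class name, the default sizes, and the choice of deriving the k hash functions from salted SHA-256 digests are all assumptions made for illustration.

```python
import hashlib


class BloomFilter:
    """Probabilistic set: no false negatives, a small rate of false positives."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)  # all bits start at 0

    def _positions(self, item):
        # Derive num_hashes bit positions by salting one hash function
        # (an illustrative choice, not a tuned hashing scheme).
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        # True means "possibly present"; False means "definitely absent".
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))


seen = BloomFilter()
for word in ["jena", "hadoop", "spark"]:
    seen.add(word)
```

A membership test such as `"spark" in seen` is always true for added items, while a true result for an item that was never added is possible and becomes more likely as the filter fills up.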
20160311
20160312
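1 Modeling and analysis of complex systems - a book
Sayama H. Introduction to the Modeling and Analysis of Complex Systems. 2015.
2 Introduction to probabilistic graphical models - lecture notes
Koller D., Friedman N. et al. Graphical Models in a Nutshell.
Koller D., Friedman N. Probabilistic Graphical Models: Principles and Techniques (Chinese edition: 概率图模型:原理与技术). Tsinghua University Press, 2015.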
20160313
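It is important to note that this report can cover only so much. With just ten interviews, it is far from exhaustive: in fact, for every interview there are dozens of other theorists and practitioners whose effort and dedication have pushed the field forward. Short as it is, the report lets us glimpse this exciting field through the eyes of these leading figures.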
20160314
1 Training and serving NLP models using Spark MLlib
Our engineering team has built a platform that trains and serves thousands of NLP models, which function in a distributed environment. This allows us to scale out quickly and provide thousands of predictions per second for many clients simultaneously. In this post, we’ll explore the types of problems we’re working to resolve, the processes we follow, and the technology stack we use. This should be helpful for anyone looking to build out or improve their own NLP pipelines.
On our radar: The essential topics and big ideas we’re tracking.
Topics and ideas tracked by O'Reilly Media.
20160315
1 Query techniques for large commonsense knowledge bases
Controlling Search in Very Large Commonsense Knowledge Bases: A Machine Learning Approach 2016
Very large commonsense knowledge bases (KBs) often have thousands to millions of axioms, of which relatively few are relevant for answering any given query. A large number of irrelevant axioms can easily overwhelm resolution-based theorem provers. Therefore, methods that help the reasoner identify useful inference paths form an essential part of large-scale reasoning systems. In this paper, we describe two ordering heuristics for optimization of reasoning in such systems. First, we discuss how decision trees can be used to select inference steps that are more likely to succeed. Second, we identify a small set of problem instance features that suffice to guide searches away from intractable regions of the search space. We show the efficacy of these techniques via experiments on thousands of queries from the Cyc KB. Results show that these methods lead to an order of magnitude reduction in inference time.
Couldn't download the full text!
2 A survey of the MapReduce family of systems
Sakr S, Liu A, Fayoumi A G. The family of MapReduce and large-scale data processing systems[J]. ACM Computing Surveys (CSUR), 2013, 46(1): 11.
RDF
A.2 MapReduce for Large Scale RDF Processing
Graph
A.3 MapReduce for Large Scale Graph Processing
20160316
1 A list of MOOC deep learning courses: Deep Learning
2 Neural network programming in Java: Neural Network Programming with Java (Packt 2016).
20160317
1 Mathematical background of knowledge representation
Related material
John F. Sowa, Knowledge Representation: Logical, Philosophical, and Computational Foundations, Brooks Cole Publishing Co., Pacific Grove, CA, ©2000. Actual publication date, 16 August 1999.
Knowledge Representation - Logical, Philosophical, and Computational Foundations
20160321
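1 K-Means clustering of handwritten digits (Python)
K-Means Clustering on Handwritten Digits
2 Overfitting and underfitting in machine learning algorithms
Overfitting and Underfitting With Machine Learning Algorithms
3 Fundamentals of artificial neural networks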
The Essence of Artificial Neural Networks
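As a reminder of what the K-Means item above involves, here is a minimal sketch of Lloyd's algorithm using only the standard library. It is not the code from the linked tutorial: it runs on a handful of made-up 2D points rather than handwritten digits, and the function name `kmeans` and the explicit initial centroids are assumptions chosen to keep the example deterministic.

```python
from math import dist  # Python 3.8+


def kmeans(points, centroids, max_iter=100):
    """Lloyd's algorithm: alternate assignment and centroid update."""
    centroids = [tuple(c) for c in centroids]

    def nearest(p):
        # Index of the centroid closest to point p.
        return min(range(len(centroids)), key=lambda i: dist(p, centroids[i]))

    for _ in range(max_iter):
        # Assignment step: group points by nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            clusters[nearest(p)].append(p)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        new_centroids = [
            tuple(sum(xs) / len(cluster) for xs in zip(*cluster))
            if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break  # converged
        centroids = new_centroids

    labels = [nearest(p) for p in points]
    return centroids, labels


points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
centroids, labels = kmeans(points, centroids=[points[0], points[3]])
```

For real image data one would normally use a library implementation with a better initialization; the sketch only shows the two alternating steps.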
20160324
1 Techniques and languages for statistical analysis
2 Debugging machine learning tasks
Unlike traditional programs (such as operating systems or word processors) which have large amounts of code, machine learning tasks use programs with relatively small amounts of code (written in machine learning libraries), but voluminous amounts of data. Just like developers of traditional programs debug errors in their code, developers of machine learning tasks debug and fix errors in their data. However, algorithms and tools for debugging and fixing errors in data are less common, when compared to their counterparts for detecting and fixing errors in code. In this paper, we consider classification tasks where errors in training data lead to misclassifications in test points, and propose an automated method to find the root causes of such misclassifications. Our root cause analysis is based on Pearl's theory of causation, and uses Pearl's PS (Probability of Sufficiency) as a scoring metric. Our implementation, Psi, encodes the computation of PS as a probabilistic program, and uses recent work on probabilistic programs and transformations on probabilistic programs (along with gray-box models of machine learning algorithms) to efficiently compute PS. Psi is able to identify root causes of data errors in interesting data sets.
20160326
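1 An introductory tutorial on deep neural networks for intelligent systems
A Tutorial on Deep Neural Networks for Intelligent Systems
2 A Springer book on Python for probability, statistics, and machine learning
Python for Probability, Statistics, and Machine Learning
3 Convolutional neural networks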
Notes on Convolutional Neural Networks
A guide to convolution arithmetic for deep learning Github code
Deep Learning for Computer Vision – Introduction to Convolution Neural Networks
20160327
Some technical blogs
1 AWS: AWS Big Data Blog
2 IBM: Blogs | IBM Big Data & Analytics Hub
3 Google Cloud: GOOGLE CLOUD BIG DATA BLOG
4 more
90+ Active Blogs on Analytics, Big Data, Data Mining, Data Science, Machine Learning
20160330
1 A list of resources on deep learning for natural language processing, on GitHub
Deep-Learning-for-NLP-Resources
20160401
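[Editor's note] This article covers Google's container management practice over the past decade, including a comparison of the history and architecture of Borg, Omega, and Kubernetes, an overview of how Google uses containers internally, and Google's efforts to make Kubernetes the container management standard through the community.
An analysis of the architectural patterns of large websites, an in-depth account of the core principles of large-scale internet architecture design, and a comprehensive introduction to the knowledge and techniques that large website architecture requires.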
20160404
1 An academically oriented article on multi-tenancy
A generic (and highly academic) discussion around multi-tenancy
2 A deep learning textbook, soon to be published by MIT Press
Deep Learning - An MIT Press book in preparation - Ian Goodfellow, Yoshua Bengio and Aaron Courville
The Deep Learning textbook is a resource intended to help students and practitioners enter the field of machine learning in general and deep learning in particular. The book will be available for sale soon, and will remain available online for free.
Some reading notes: All of Recurrent Neural Networks and LSTM
20160405
1 Good practices for writing shell scripts
Good practices for writing shell scripts Or how to care about the people reading them
2 Linux system calls: The Definitive Guide to Linux System Calls
20160406
Network graph theory
Network theory has proven to be a powerful tool in describing and analyzing systems by modelling the relations between their constituent objects. In recent years great progress has been made by augmenting `traditional' network theory. However, existing network representations still lack crucial features in order to serve as a general data analysis tool. These include, most importantly, an explicit association of information with possibly heterogeneous types of objects and relations, and a conclusive representation of the properties of groups of nodes as well as the interactions between such groups on different scales.
In this paper, we introduce a collection of definitions resulting in a framework that, on the one hand, entails and unifies existing network representations (e.g., network of networks, multilayer networks), and on the other hand, generalizes and extends them by incorporating the above features. To implement these features, we first specify the nodes and edges of a finite graph as sets of properties. Second, the mathematical concept of partition lattices is transferred to network theory in order to demonstrate how partitioning the node and edge set of a graph into supernodes and superedges allows to aggregate, compute and allocate information on and between arbitrary groups of nodes. The derived partition lattice of a graph, which we denote by deep graph, constitutes a concise, yet comprehensive representation that enables the expression and analysis of heterogeneous properties, relations and interactions on all scales of a complex system in a self-contained manner. Furthermore, to be able to utilize existing network-based methods and models, we derive different representations of multilayer networks from our framework and demonstrate the advantages of our representation. We exemplify an application of deep graphs using a real world dataset of precipitation measurements.
20160408
1 Evolutionary computation
Evolutionary Computation: Theory
Evolutionary Computation: Phenotype
Evolutionary Computation: Genotype
Evolutionary Computation: Loop
20160520
1 Visualization of large-scale, high-dimensional data
Paper: Visualizing Large-scale and High-dimensional Data
Author's homepage: Big Data Visualization
GitHub implementation: largeVis
20160526
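1 Domain entity relation mining: InfoQ, "How to mine entity relations in a specific domain? Tips from the experts"
2 A discussion of whether algorithms need big data: Algorithms That Learn with Less Data Could Expand AI’s Power
Chinese translation: "Blindly chasing big data is a pitfall in machine learning; our algorithm needs less data and runs faster"
3 Hadoop visualization and interactive tools: Zeppelin and Hue
4 An algorithm visualization tool: Algorithm Visualizer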
20160530
1 Enterprise technologies to watch in 2016: The enterprise technologies to watch in 2016
20160602
Moved to Notes of Ideas