Source code for zorka and zico: https://github.com/jitlogic

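zico is the collector side of zorka, so before introducing zico we first look at zorka and at how data is structured for storage and transmission. After zorka captures data, it wraps it in a TraceRecord, which carries the class, method, call chain and other information for that trace. The TraceRecord stores only IDs for these items; the details behind each ID live in Symbol objects, so the Symbols act as a data dictionary. Before transmission and storage, every item is tagged to mark whether it is a Symbol entry or a TraceRecord entry.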
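The dictionary idea can be illustrated with a minimal sketch. The class below is hypothetical and is not zorka's Symbol or SymbolRegistry code; it only borrows the method names symbolId() and symbolName() that appear in the listings later in this post.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified stand-in for a symbol dictionary: each name is
// interned once, and records carry only the integer IDs.
class SymbolDictionarySketch {
    private final Map<String, Integer> idsByName = new HashMap<>();
    private final Map<Integer, String> namesById = new HashMap<>();
    private int nextId = 1;

    // Returns the existing ID for a name, or assigns the next free one.
    int symbolId(String name) {
        Integer id = idsByName.get(name);
        if (id == null) {
            id = nextId++;
            idsByName.put(name, id);
            namesById.put(id, name);
        }
        return id;
    }

    // Resolves an ID back to its name; returns null if the ID is unknown.
    String symbolName(int id) {
        return namesById.get(id);
    }

    public static void main(String[] args) {
        SymbolDictionarySketch dict = new SymbolDictionarySketch();
        int classId = dict.symbolId("com.example.OrderService");
        int methodId = dict.symbolId("placeOrder");
        // A trace record would store only classId and methodId.
        System.out.println(dict.symbolName(classId) + "." + dict.symbolName(methodId));
    }
}
```

Storing IDs keeps each record small, at the cost of having to ship the dictionary entries alongside the records.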

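First, let's walk through the main steps of how Zico works.

ZicoServer accepts connections over a socket. In its run() method it creates a ZicoServerConnector (which extends ZicoConnector), and the core of ZicoServerConnector is the runCycle() function.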

// This method is called from run()
private void runCycle() throws IOException {
    ZicoPacket pkt = recv();
    switch (pkt.getStatus()) {
        case ZICO_PING: {
            send(ZICO_PONG);
            break;
        }
        case ZICO_HELLO: {
            List<Object> lst = ZicoCommonUtil.unpack(pkt.getData());
            log.debug("Encountered ZICO HELLO packet: " + lst + "(addr=" + saddr + ")");
            if (lst.size() > 0 && lst.get(0) instanceof HelloRequest) {
                context = factory.get(socket, (HelloRequest) lst.get(0));
                send(ZICO_OK);
            } else {
                log.error("ZICO_HELLO packet with invalid content: " + lst + "(addr=" + addr + ")");
                send(ZICO_BAD_REQUEST);
            }
            break;
        }
        case ZICO_DATA: {
            //log.debug("Received ZICO data packet from " + saddr + ": status=" + pkt.getStatus()
            // + ", dlen=" + pkt.getData().length);
            if (context != null) {
                for (Object o : ZicoCommonUtil.unpack(pkt.getData())) {
                    context.process(o);
                }
                context.commit();
                send(ZICO_OK);
            } else {
                log.error("Client " + saddr + " not authorized.");
                send(ZICO_AUTH_ERROR);
            }
            break;
        }
        default:
            log.error("ZICO packet from " + saddr + " with invalid status code: " + pkt.getStatus());
            send(ZICO_BAD_REQUEST);
            break;
    }
}

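recv() wraps the data read from the socket in a ZicoPacket object and returns it. Early in a connection, zico and zorka exchange ping-pong and hello packets to test the connection. ZicoPacket.getData() returns a byte array, and ZicoCommonUtil.unpack() unpacks a list of fressian-encoded objects from that byte array. The result of unpack() is a List<Object>; the ZicoDataProcessor (the context field above) calls process() on each element and then commit() once for the packet.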
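The ordering rule in runCycle() (a HELLO must succeed before DATA is accepted) can be sketched as a tiny state machine. The enum and class below are hypothetical; only the status names mirror the listing, and the boolean flag stands in for the "context != null" check.

```java
// Hypothetical sketch of the handshake-then-data rule in runCycle().
class SessionSketch {
    enum Status { PING, HELLO, DATA, PONG, OK, AUTH_ERROR, BAD_REQUEST }

    // Stands in for "context != null" in the listing.
    private boolean authorized = false;

    // Returns the reply status for one incoming packet.
    Status handle(Status request, boolean validHello) {
        switch (request) {
            case PING:
                return Status.PONG;
            case HELLO:
                if (validHello) {
                    authorized = true;
                    return Status.OK;
                }
                return Status.BAD_REQUEST;
            case DATA:
                // Data is only accepted after a successful HELLO.
                return authorized ? Status.OK : Status.AUTH_ERROR;
            default:
                return Status.BAD_REQUEST;
        }
    }
}
```

A client that sends DATA first gets AUTH_ERROR; after a valid HELLO the same DATA packet is answered with OK.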

ReceiverContext is the only class that implements the ZicoDataProcessor interface, so let's look at it next.

@Override
public synchronized void process(Object obj) throws IOException {
    if (hostStore.hasFlag(HostInfo.DELETED)) {
        log.info("Resetting connection for " + hostStore.getName() + " due to dirty SID map.");
        throw new ZicoException(ZicoPacket.ZICO_EOD,
            "Host has been deleted. Connection needs to be reset. Try again.");
    }
    if (hostStore.hasFlag(HostInfo.DISABLED)) {
        // Host store is disabled. Ignore incoming packets.
        return;
    }
    if (dirtySidMap) {
        log.info("Resetting connection for " + hostStore.getName() + " due to dirty SID map.");
        throw new ZicoException(ZicoPacket.ZICO_EOD,
            "Host was disabled, then enabled and SID map is dirty. Resetting connection.");
    }
    try {
        if (obj instanceof Symbol) {
            processSymbol((Symbol) obj);
        } else if (obj instanceof TraceRecord) {
            processTraceRecord((TraceRecord) obj);
        } else {
            if (obj != null) {
                log.warn("Unsupported object type:" + obj.getClass());
            } else {
                log.warn("Attempted processing NULL object (?)");
            }
        }
    } catch (Exception e) {
        log.error("Error processing trace record: ", e);
    }
}

ReceiverContext holds a HostStore object, which represents the performance data store for a single agent.

Next, let's look at the processTraceRecord() method in ReceiverContext.

private void processTraceRecord(TraceRecord rec) throws IOException {
    if (!hostStore.hasFlag(HostInfo.DISABLED)) {
        rec.traverse(this);
        visitedObjects.clear();
        hostStore.processTraceRecord(rec);
    } else {
        log.debug("Dropping trace for inactive host: " + hostStore.getName());
    }
}

This method calls TraceRecord's traverse() method.

public void traverse(MetadataChecker checker) throws IOException {
    classId = checker.checkSymbol(classId, this);
    methodId = checker.checkSymbol(methodId, this);
    signatureId = checker.checkSymbol(signatureId, this);
    if (exception instanceof SymbolicException) {
        ((SymbolicException) exception).traverse(checker);
    }
    if (attrs != null) {
        Map<Integer, Object> newAttrs = new LinkedHashMap<Integer, Object>();
        for (Map.Entry<Integer, Object> e : attrs.entrySet()) {
            newAttrs.put(checker.checkSymbol(e.getKey(), this), e.getValue());
        }
        attrs = newAttrs;
    }
    if (children != null) {
        for (TraceRecord child : children) {
            child.traverse(checker);
        }
    }
    if (marker != null && 0 != (flags & TRACE_BEGIN)) {
        marker.traverse(checker);
    }
}

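traverse() passes three IDs (classId, methodId and signatureId) through checker.checkSymbol(), then visits the attrs of the TraceRecord and, recursively, of its children (attributes grabbed at this method execution by the spy instrumentation engine).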
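One plausible reason for routing every ID through checkSymbol() is that IDs assigned on the agent side need translating to IDs on the collector side; the "dirty SID map" messages in process() hint at such a mapping, but the listings do not confirm the details. The sketch below is a hypothetical illustration of that idea, not zico's implementation: the sidMap field, learn() and remapAttrs() are assumed names.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical checker that translates agent-side IDs to collector-side IDs.
class RemapSketch {
    // Assumed mapping: remote (agent) ID -> local (collector) ID.
    private final Map<Integer, Integer> sidMap = new HashMap<>();

    void learn(int remoteId, int localId) {
        sidMap.put(remoteId, localId);
    }

    // Returns the local ID, or 0 when the remote ID has not been seen.
    int checkSymbol(int remoteId) {
        return sidMap.getOrDefault(remoteId, 0);
    }

    // Rebuilds the attribute map with translated keys, preserving order,
    // in the same way traverse() rebuilds attrs.
    Map<Integer, Object> remapAttrs(Map<Integer, Object> attrs) {
        Map<Integer, Object> out = new LinkedHashMap<>();
        for (Map.Entry<Integer, Object> e : attrs.entrySet()) {
            out.put(checkSymbol(e.getKey()), e.getValue());
        }
        return out;
    }
}
```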

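processTraceRecord() then calls processTraceRecord(TraceRecord rec) in the HostStore class, which wraps the TraceRecord (represents trace information about a single method execution; may contain references to information about calls from this method) in a TraceInfoRecord. Compared with TraceRecord, TraceInfoRecord has four extra properties: dataOffs, dataLen, indexOffs and indexLen. My guess is that they locate the record's chunks in the data and index stores.

HostStore.processTraceRecord() breaks the TraceRecord down further: the record is written to the data store and the index store, and a summary is put into infos, each of which ends up in its own file.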

public void processTraceRecord(TraceRecord rec) throws IOException {
    TraceRecordStore traceDataStore = getTraceDataStore();
    TraceRecordStore traceIndexStore = getTraceIndexStore();
    BTreeMap<Long,TraceInfoRecord> infos = getInfos();
    Map<Integer,String> tids = getTids();
    if (traceDataStore == null || traceIndexStore == null || infos == null || tids == null
        || hasFlag(HostInfo.DISABLED|HostInfo.DELETED)) {
        throw new ZicoRuntimeException("Store " + getName() + " is closed and cannot accept records.");
    }
    TraceRecordStore.ChunkInfo dchunk = traceDataStore.write(rec);
    List<TraceRecord> tmp = rec.getChildren();
    int numRecords = ZicoUtil.numRecords(rec);
    rec.setChildren(null);
    TraceRecordStore.ChunkInfo ichunk = traceIndexStore.write(rec);
    rec.setChildren(tmp);
    TraceInfoRecord tir = new TraceInfoRecord(rec,numRecords,
        dchunk.getOffset(), dchunk.getLength(),
        ichunk.getOffset(), ichunk.getLength(),
        rec.getAttrs() != null ? ZicoUtil.toIntArray(rec.getAttrs().keySet()) : null);
    checkAttrs(tir);
    infos.put(tir.getDataOffs(), tir);
    int traceId = tir.getTraceId();
    if (!tids.containsKey(traceId)) {
        tids.put(traceId, symbolRegistry.symbolName(traceId));
    }
}
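The write path in the listing above (full record to the data store, root-only record to the index store, summary keyed by data offset) can be sketched as follows. This is a simplified, in-memory illustration under stated assumptions: AppendStore and Chunk are hypothetical stand-ins for TraceRecordStore and ChunkInfo, and records are plain strings.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.TreeMap;

class TwoStoreSketch {
    // Offset and length of one written chunk (plays the role of ChunkInfo).
    static final class Chunk {
        final long offset;
        final int length;
        Chunk(long offset, int length) {
            this.offset = offset;
            this.length = length;
        }
    }

    // Hypothetical append-only store backed by an in-memory buffer.
    static final class AppendStore {
        private final ByteArrayOutputStream buf = new ByteArrayOutputStream();

        Chunk write(String payload) {
            byte[] bytes = payload.getBytes(StandardCharsets.UTF_8);
            long offset = buf.size();
            buf.write(bytes, 0, bytes.length);
            return new Chunk(offset, bytes.length);
        }

        String read(Chunk c) {
            return new String(buf.toByteArray(), (int) c.offset, c.length, StandardCharsets.UTF_8);
        }
    }

    final AppendStore data = new AppendStore();
    final AppendStore index = new AppendStore();
    // Keyed by data offset, like infos.put(tir.getDataOffs(), tir).
    final TreeMap<Long, Chunk[]> infos = new TreeMap<>();

    // Writes the full record and the root-only record, then indexes both chunks.
    void store(String fullRecord, String rootOnly) {
        Chunk d = data.write(fullRecord);
        Chunk i = index.write(rootOnly);
        infos.put(d.offset, new Chunk[] { d, i });
    }
}
```

Because the data offset grows monotonically, using it as the map key also gives the entries an arrival order that a search can walk in either direction.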

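To summarize the path from socket to file so far: ZicoServerConnector creates an InputStream from the socket and reads data from it; at this point the data is a fressian-encoded byte array. ZicoCommonUtil.unpack() decodes that byte array into a List<Object>, ReceiverContext sorts these objects into Symbols and TraceRecords, and HostStore breaks each TraceRecord down into the more detailed pieces that correspond to the directory entries described below.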

zico stores the data it receives from agents under the zico/data directory, which contains the following entries:

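tdat: judging from open() below, the trace data store (traceDataStore), which receives the full TraceRecord.

tidx: judging from open() below, the trace index store (traceIndexStore), which receives each TraceRecord without its children. Its on-disk layout is still unclear to me.

host.properties: host-related information (name, IP, group, and so on).

symbols.dat: the symbol file read by PersistentSymbolRegistry.

traces.db: the database file opened in open(), holding the infos, tids and attrs maps.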

The HostStore class represents the performance data store for a single agent. Its constructor calls open(), and we can use the following code to analyze the directory structure:

public synchronized void open() {
    try {
        load();
        if (symbolRegistry == null) {
            symbolRegistry = new PersistentSymbolRegistry(
                new File(ZicoUtil.ensureDir(rootPath), "symbols.dat"));
        }
        if (traceDataStore == null) {
            traceDataStore = new TraceRecordStore(config, this, "tdat", 1, this);
        }
        if (traceIndexStore == null) {
            traceIndexStore = new TraceRecordStore(config, this, "tidx", 4);
        }
        db = dbf.openDB(ZorkaUtil.path(rootPath, "traces.db"));

        //private BTreeMap<Long, TraceInfoRecord> infos;
        infos = db.getTreeMap(DB_INFO_MAP);
        //private Map<Integer, String> tids;
        tids = db.getTreeMap(DB_TIDS_MAP);
        //private BTreeMap<Integer,Set<Integer>> attrs;
        attrs = db.getTreeMap(DB_ATTR_MAP);
    } catch (IOException e) {
        log.error("Cannot open host store " + name, e);
    }
}

PersistentSymbolRegistry extends SymbolRegistry, which holds two maps (symbolIds and symbolNames). The open() method of PersistentSymbolRegistry reads symbols from the symbols.dat file and puts them into symbolIds and symbolNames.

The TraceRecordStore constructor creates an RDSStore object (Raw Data Store (RDS) builds upon RAGZ stream classes). RDSStore.open() sets up a RAGZInputStream and a RAGZOutputStream to read and write RAGZSegments (each represents a single in-memory segment). RAGZSegment has an unpack() method that decompresses the data using the java.util.zip package.
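To show what compressing and decompressing a segment with java.util.zip looks like, here is a minimal round trip using plain GZIP streams. It is only an illustration: the actual RAGZ segment format and the RAGZSegment API are not shown in this post, and the class and method names below are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper: plain GZIP round trip, not the RAGZ on-disk format.
class GzipSegmentSketch {
    // Compresses a raw segment into GZIP bytes.
    static byte[] pack(byte[] raw) {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(bos)) {
            gz.write(raw);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bos.toByteArray();
    }

    // Decompresses GZIP bytes back into the raw segment.
    static byte[] unpack(byte[] packed) {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (GZIPInputStream gz = new GZIPInputStream(new ByteArrayInputStream(packed))) {
            byte[] buf = new byte[4096];
            int n;
            while ((n = gz.read(buf)) != -1) {
                bos.write(buf, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bos.toByteArray();
    }
}
```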

The pom file of the zico-core module contains the following:

<dependency>
    <groupId>org.mapdb</groupId>
    <artifactId>mapdb</artifactId>
    <version>${mapdb.version}</version>
</dependency>

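This adds MapDB to the Maven project. MapDB is a lightweight embedded NoSQL database, and the maps that open() obtains through db.getTreeMap() are stored in traces.db. The code base also uses org.parboiled.BaseParser (http://parboiled.org); parboiled is a parsing library, not a database.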

Part 2

zico uses Google Guice for IoC. ProdZicoModule contains the following:

@Override
public void configure(Binder binder) {
    super.configure(binder);
    binder.bind(UserManager.class).asEagerSingleton();
    binder.bind(UserContext.class).to(UserHttpContext.class);
    binder.bind(DBFactory.class).to(FileDBFactory.class);
    binder.bind(ZicoDataProcessorFactory.class).to(HostStoreManager.class);
}

Let's look at the last binding first. The ZicoDataProcessorFactory interface has a single get() method, which the HostStoreManager class implements as follows:

@Override
public ZicoDataProcessor get(Socket socket, HelloRequest hello) throws IOException {
    if (hello.getHostname() == null) {
        log.error("Received HELLO packet with null hostname.");
        throw new ZicoException(ZicoPacket.ZICO_BAD_REQUEST, "Null hostname.");
    }
    HostStore store = getHost(hello.getHostname(), !enableSecurity);
    if (store == null) {
        throw new ZicoException(ZicoPacket.ZICO_AUTH_ERROR, "Unauthorized.");
    }
    if (store.getAddr() == null || store.getAddr().length() == 0) {
        store.setAddr(socket.getInetAddress().getHostAddress());
        store.save();
    }
    if (enableSecurity) {
        if (store.getAddr() != null && !store.getAddr().equals(socket.getInetAddress().getHostAddress())) {
            throw new ZicoException(ZicoPacket.ZICO_AUTH_ERROR, "Unauthorized.");
        }
        if (store.getPass() != null && !store.getPass().equals(hello.getAuth())) {
            throw new ZicoException(ZicoPacket.ZICO_AUTH_ERROR, "Unauthorized.");
        }
    }
    return new ReceiverContext(store);
}

This method mainly looks up the HostStore for the connecting host and returns a ReceiverContext bound to it. ReceiverContext implements the process() and commit() methods of ZicoDataProcessor.
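The security checks inside get() can be isolated into a small predicate. The sketch below covers only the enableSecurity branch of the listing (it does not model the host lookup or the step that records the address of a new host); the class and parameter names are hypothetical.

```java
// Hypothetical extraction of the address and password checks in get().
class HelloCheckSketch {
    // Returns true when the HELLO would pass the checks in the listing.
    static boolean authorized(boolean enableSecurity,
                              String storedAddr, String peerAddr,
                              String storedPass, String helloAuth) {
        if (!enableSecurity) {
            return true; // checks are skipped entirely when security is off
        }
        if (storedAddr != null && !storedAddr.equals(peerAddr)) {
            return false; // connection comes from a different address
        }
        if (storedPass != null && !storedPass.equals(helloAuth)) {
            return false; // password set on the host but not matched
        }
        return true;
    }
}
```

Note that a host with no stored password is accepted on address alone, which matches the null checks in the listing.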

Part 3: a look at TraceDataService

***

HostStore has a search() function:

public TraceInfoSearchResult search(TraceInfoSearchQuery query) throws IOException {
    SymbolRegistry symbolRegistry = getSymbolRegistry();
    BTreeMap<Long,TraceInfoRecord> infos = getInfos();
    TraceRecordStore traceDataStore = getTraceDataStore();
    TraceRecordStore traceIndexStore = getTraceIndexStore();
    if (symbolRegistry == null || infos == null || traceDataStore == null || traceIndexStore == null) {
        throw new ZicoRuntimeException("Host store " + getName() + " is closed.");
    }
    List<TraceInfo> lst = new ArrayList<TraceInfo>(query.getLimit());
    TraceInfoSearchResult result = new TraceInfoSearchResult();
    result.setSeq(query.getSeq());
    result.setResults(lst);
    TraceRecordMatcher matcher = null;
    int traceId = query.getTraceName() != null ? symbolRegistry.symbolId(query.getTraceName()) : 0;
    if (query.getSearchExpr() != null) {
        if (query.hasFlag(TraceInfoSearchQuery.EQL_QUERY)) {
            matcher = new EqlTraceRecordMatcher(symbolRegistry,
                Parser.expr(query.getSearchExpr()),
                0, 0, getName());
        } else if (query.getSearchExpr().length() > 0 && query.getSearchExpr().startsWith("~")) {
            matcher = new FullTextTraceRecordMatcher(symbolRegistry,
                TraceRecordSearchQuery.SEARCH_ALL, Pattern.compile(query.getSearchExpr().substring(1)));
        } else {
            matcher = new FullTextTraceRecordMatcher(symbolRegistry,
                TraceRecordSearchQuery.SEARCH_ALL, query.getSearchExpr());
        }
    }
    // TODO implement query execution time limit
    int searchFlags = query.getFlags();
    boolean asc = 0 == (searchFlags & TraceInfoSearchQuery.ORDER_DESC);
    Long initialKey = asc
        ? infos.higherKey(query.getOffset() != 0 ? query.getOffset() : Long.MIN_VALUE)
        : infos.lowerKey(query.getOffset() != 0 ? query.getOffset() : Long.MAX_VALUE);
    long tstart = System.nanoTime();
    for (Long key = initialKey; key != null; key = asc ? infos.higherKey(key) : infos.lowerKey(key)) {
        long t = System.nanoTime()-tstart;
        if ((lst.size() >= query.getLimit()) || (t > MAX_SEARCH_T1 && lst.size() > 0) || (t > MAX_SEARCH_T2)) {
            result.markFlag(TraceInfoSearchResult.MORE_RESULTS);
            return result;
        }
        TraceInfoRecord tir = infos.get(key);
        result.setLastOffs(key);
        if (query.hasFlag(TraceInfoSearchQuery.ERRORS_ONLY) && 0 == (tir.getTflags() & TraceMarker.ERROR_MARK)) {
            continue;
        }
        if (query.getStartDate() != 0 && tir.getClock() < query.getStartDate()) {
            continue;
        }
        if (query.getEndDate() != 0 && tir.getClock() > query.getEndDate()) {
            continue;
        }
        if (traceId != 0 && tir.getTraceId() != traceId) {
            continue;
        }
        if (tir.getDuration() < query.getMinMethodTime()) {
            continue;
        }
        TraceRecord idxtr = (query.hasFlag(TraceInfoSearchQuery.DEEP_SEARCH) && matcher != null)
            ? traceDataStore.read(tir.getDataChunk())
            : traceIndexStore.read(tir.getIndexChunk());
        if (idxtr != null) {
            if (matcher instanceof EqlTraceRecordMatcher) {
                ((EqlTraceRecordMatcher) matcher).setTotalTime(tir.getDuration());
            }
            if (matcher == null || recursiveMatch(matcher, idxtr)) {
                lst.add(toTraceInfo(tir, idxtr));
            }
        }
    }
    return result;
}
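The core of search() is a key walk over a sorted map: start just after (or before) the offset, then step with higherKey() or lowerKey() until the limit is hit. The sketch below reproduces only that paging pattern using java.util.TreeMap, which offers the same higherKey()/lowerKey() methods; the filters, matchers and time limits are left out, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

class KeyWalkSketch {
    // Returns up to `limit` keys strictly after (asc) or strictly before (desc)
    // `offset`. An offset of 0 means "start from the first/last key".
    static List<Long> page(TreeMap<Long, String> infos, long offset, int limit, boolean asc) {
        List<Long> out = new ArrayList<>();
        Long key = asc
                ? infos.higherKey(offset != 0 ? offset : Long.MIN_VALUE)
                : infos.lowerKey(offset != 0 ? offset : Long.MAX_VALUE);
        while (key != null && out.size() < limit) {
            out.add(key);
            key = asc ? infos.higherKey(key) : infos.lowerKey(key);
        }
        return out;
    }
}
```

A caller pages through results by passing the last key of one page as the offset of the next, which is the role setLastOffs() appears to play in the listing.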

// DataReceptionUnitTest shows that a HostStore contains a SymbolRegistry.

// zico uses a RESTful style when interacting with the agent side.
