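Enterprise Search Engine Development: The Connector (Part 28)
Typically each SnapshotRepository object corresponds to one DocumentSnapshotRepositoryMonitor object and also to one snapshot store (SnapshotStore) object. These are tied together by the monitor manager, DocumentSnapshotRepositoryMonitorManagerImpl.
To see what behavior DocumentSnapshotRepositoryMonitorManagerImpl has to implement, first look at the methods defined by the interface it implements, DocumentSnapshotRepositoryMonitorManager: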
/**
 * Management interface to {@link DocumentSnapshotRepositoryMonitor} threads.
 *
 * @since 2.8
 */
public interface DocumentSnapshotRepositoryMonitorManager {
  /**
   * Ensures all monitor threads are running.
   *
   * @param checkpoint for the last completed document or null if none have
   *        been completed.
   * @throws RepositoryException
   */
  void start(String checkpoint) throws RepositoryException;

  /**
   * Stops all the configured {@link DocumentSnapshotRepositoryMonitor} threads.
   */
  void stop();

  /**
   * Removes persisted state for {@link DocumentSnapshotRepositoryMonitor}
   * threads. After calling this {@link DocumentSnapshotRepositoryMonitor}
   * threads will no longer be able to resume from where they left off last
   * time.
   */
  void clean();

  /**
   * Returns the number of {@link DocumentSnapshotRepositoryMonitor} threads
   * that are alive. This method is for testing purposes.
   */
  int getThreadCount();

  /**
   * Returns the {@link CheckpointAndChangeQueue} for this
   * {@link DocumentSnapshotRepositoryMonitorManager}
   */
  CheckpointAndChangeQueue getCheckpointAndChangeQueue();

  /** Returns whether we are after a start() call and before a stop(). */
  boolean isRunning();

  /**
   * Receives information specifying what is guaranteed to be delivered to GSA.
   * Every entry in passed in Map is a monitor name and MonitorCheckpoint.
   * The monitor of that name can expect that all documents before and including
   * document related with MonitorCheckpoint will be delivered to GSA.
   * This information is for the convenience and efficiency of the Monitor so
   * that it knows how many changes it has to resend. It's valid for a monitor
   * to ignore these updates if it feels like it for some good reason.
   * FileConnectorSystemMonitor instances use this information to trim their
   * file system snapshots.
   */
  void acceptGuarantees(Map<String, MonitorCheckpoint> guarantees);

  /**
   * Receives {@link TraversalSchedule} from TraversalManager which is
   * {@link TraversalScheduleAware}.
   */
  void setTraversalSchedule(TraversalSchedule traversalSchedule);
}
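The acceptGuarantees contract can be illustrated with a toy model. The sketch below is not the connector's code: BufferingMonitor, the numeric offsets and the Map<String, Long> of guarantees are all hypothetical stand-ins (the real method takes MonitorCheckpoint values, whose structure is not shown here). It only shows the idea that a monitor may discard buffered changes once delivery up to a given point is guaranteed.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical demo of routing delivery guarantees to monitors by name. */
class GuaranteeDemo {
  /** Passes each guarantee to the monitor with the same name; unknown names are ignored. */
  static void acceptGuarantees(Map<String, BufferingMonitor> monitors,
      Map<String, Long> guarantees) {
    for (Map.Entry<String, Long> entry : guarantees.entrySet()) {
      BufferingMonitor monitor = monitors.get(entry.getKey());
      if (monitor != null) {
        monitor.acceptGuarantee(entry.getValue());
      }
    }
  }

  public static void main(String[] args) {
    BufferingMonitor monitor = new BufferingMonitor();
    for (long offset = 1; offset <= 5; offset++) {
      monitor.recordChange(offset);
    }
    Map<String, BufferingMonitor> monitors = new HashMap<>();
    monitors.put("repoA", monitor); // hypothetical monitor name
    Map<String, Long> guarantees = new HashMap<>();
    guarantees.put("repoA", 3L); // everything up to offset 3 is guaranteed delivered
    acceptGuarantees(monitors, guarantees);
    System.out.println(monitor.pending()); // only offsets after 3 still need resending
  }
}

/** Hypothetical monitor that buffers numbered changes until delivery is guaranteed. */
class BufferingMonitor {
  private final List<Long> pendingOffsets = new ArrayList<>();

  void recordChange(long offset) {
    pendingOffsets.add(offset);
  }

  /** Drops every buffered change at or before the guaranteed offset. */
  void acceptGuarantee(long guaranteedOffset) {
    pendingOffsets.removeIf(offset -> offset <= guaranteedOffset);
  }

  /** Returns a copy of the offsets that are not yet guaranteed. */
  List<Long> pending() {
    return new ArrayList<>(pendingOffsets);
  }
}
```

A real monitor is free to ignore such updates, as the Javadoc notes; trimming is an optimization, not a requirement.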
Now let's look at how DocumentSnapshotRepositoryMonitorManagerImpl implements the behavior defined in that interface.
Start with its fields and how they are initialized:
private volatile TraversalSchedule traversalSchedule;

// Monitor threads.
private final List<Thread> threads =
    Collections.synchronizedList(new ArrayList<Thread>());

// Monitors keyed by monitor name.
private final Map<String, DocumentSnapshotRepositoryMonitor> fileSystemMonitorsByName =
    Collections.synchronizedMap(new HashMap<String, DocumentSnapshotRepositoryMonitor>());

private boolean isRunning = false;  // Monitor threads start in off state.

private final List<? extends SnapshotRepository<? extends DocumentSnapshot>>
    repositories;

private final File snapshotDir;
private final ChecksumGenerator checksumGenerator;

// Container of CheckpointAndChange objects (a List).
private final CheckpointAndChangeQueue checkpointAndChangeQueue;

// Container of Change objects (a blocking queue).
private final ChangeQueue changeQueue;

private final DocumentSnapshotFactory documentSnapshotFactory;

/**
 * Constructs {@link DocumentSnapshotRepositoryMonitorManagerImpl}
 * for the {@link DiffingConnector}.
 *
 * @param repositories a {@code List} of {@link SnapshotRepository
 *        SnapshotRepositorys}
 * @param documentSnapshotFactory a {@link DocumentSnapshotFactory}
 * @param snapshotDir directory to store {@link SnapshotRepository}
 * @param checksumGenerator a {@link ChecksumGenerator} used to
 *        detect changes in a document's content
 * @param changeQueue a {@link ChangeQueue}
 * @param checkpointAndChangeQueue a
 *        {@link CheckpointAndChangeQueue}
 */
public DocumentSnapshotRepositoryMonitorManagerImpl(
    List<? extends SnapshotRepository<
        ? extends DocumentSnapshot>> repositories,
    DocumentSnapshotFactory documentSnapshotFactory,
    File snapshotDir, ChecksumGenerator checksumGenerator,
    ChangeQueue changeQueue,
    CheckpointAndChangeQueue checkpointAndChangeQueue) {
  this.repositories = repositories;
  this.documentSnapshotFactory = documentSnapshotFactory;
  this.snapshotDir = snapshotDir;
  this.checksumGenerator = checksumGenerator;
  this.changeQueue = changeQueue;
  this.checkpointAndChangeQueue = checkpointAndChangeQueue;
}
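Next comes its start method. It does three things in order: it calls start on the checkpointAndChangeQueue object, initializes the SnapshotStore associated with each repository, and finally starts a monitor instance for each repository.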
/**
 * Start method.
 * Go from "cold" to "warm" including CheckpointAndChangeQueue.
 */
public void start(String connectorManagerCheckpoint)
    throws RepositoryException {
  try {
    // Start the queue. This mainly loads monitorPoints and the
    // checkpointAndChangeList from the JSON-format queue file.
    checkpointAndChangeQueue.start(connectorManagerCheckpoint);
  } catch (IOException e) {
    throw new RepositoryException("Failed starting CheckpointAndChangeQueue.",
        e);
  }

  // MonitorCheckpoint objects keyed by monitor name.
  Map<String, MonitorCheckpoint> monitorPoints
      = checkpointAndChangeQueue.getMonitorRestartPoints();

  Map<String, SnapshotStore> snapshotStores = null;

  // Build the map from monitorName to SnapshotStore.
  try {
    snapshotStores =
        recoverSnapshotStores(connectorManagerCheckpoint, monitorPoints);
  } catch (SnapshotStoreException e) {
    throw new RepositoryException("Snapshot recovery failed.", e);
  } catch (IOException e) {
    throw new RepositoryException("Snapshot recovery failed.", e);
  } catch (InterruptedException e) {
    throw new RepositoryException("Snapshot recovery interrupted.", e);
  }

  // Start the monitor threads.
  startMonitorThreads(snapshotStores, monitorPoints);
  isRunning = true;
}
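When the SnapshotStore for each repository is initialized, the associated MonitorCheckpoint instance is passed in as well. The snapshot directory is deleted if the monitor has to start empty; otherwise the snapshot files are repaired (stitched) where necessary.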
/**
 * For each start path, gets its monitor recovery files into a state where
 * the monitor can be started. Builds the map from monitorName to
 * SnapshotStore.
 *
 * @param connectorManagerCheckpoint
 * @param monitorPoints
 * @return
 * @throws IOException
 * @throws SnapshotStoreException
 * @throws InterruptedException
 */
private Map<String, SnapshotStore> recoverSnapshotStores(
    String connectorManagerCheckpoint, Map<String,
    MonitorCheckpoint> monitorPoints)
    throws IOException, SnapshotStoreException, InterruptedException {
  Map<String, SnapshotStore> snapshotStores =
      new HashMap<String, SnapshotStore>();
  for (SnapshotRepository<? extends DocumentSnapshot> repository
      : repositories) {
    String monitorName = makeMonitorNameFromStartPath(repository.getName());
    File dir = new File(snapshotDir, monitorName);

    boolean startEmpty = (connectorManagerCheckpoint == null)
        || (!monitorPoints.containsKey(monitorName));
    if (startEmpty) {
      LOG.info("Deleting " + repository.getName()
          + " global checkpoint=" + connectorManagerCheckpoint
          + " monitor checkpoint=" + monitorPoints.get(monitorName));
      // Delete this snapshot directory.
      delete(dir);
    } else {
      // Repair this snapshot directory.
      SnapshotStore.stitch(dir, monitorPoints.get(monitorName),
          documentSnapshotFactory);
    }

    SnapshotStore snapshotStore = new SnapshotStore(dir,
        documentSnapshotFactory);

    snapshotStores.put(monitorName, snapshotStore);
  }
  return snapshotStores;
}
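The delete-or-stitch decision in recoverSnapshotStores comes down to one boolean test. The sketch below isolates that test so it can be run on its own. RecoveryDecision, the Action enum and the string-valued checkpoints are hypothetical names introduced for illustration; the real method works with MonitorCheckpoint objects and performs file operations that are omitted here.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical, file-free model of the startEmpty decision. */
class RecoveryDecision {
  enum Action { DELETE_AND_START_EMPTY, STITCH_AND_RESUME }

  /**
   * Mirrors the startEmpty test: start empty when there is no global
   * checkpoint, or when no restart point is recorded for this monitor.
   */
  static Action decide(String connectorManagerCheckpoint,
      Map<String, String> monitorPoints, String monitorName) {
    boolean startEmpty = (connectorManagerCheckpoint == null)
        || (!monitorPoints.containsKey(monitorName));
    return startEmpty ? Action.DELETE_AND_START_EMPTY : Action.STITCH_AND_RESUME;
  }

  public static void main(String[] args) {
    Map<String, String> monitorPoints = new HashMap<>();
    monitorPoints.put("monitorA", "cp-17"); // hypothetical restart point

    // No global checkpoint: the snapshot directory would be deleted.
    System.out.println(decide(null, monitorPoints, "monitorA"));
    // Global checkpoint exists but this monitor has no restart point: also deleted.
    System.out.println(decide("global-1", monitorPoints, "monitorB"));
    // Both present: the snapshot directory would be stitched and reused.
    System.out.println(decide("global-1", monitorPoints, "monitorA"));
  }
}
```

Either missing piece alone forces a fresh start, which is the conservative choice: without both checkpoints there is no reliable point to resume from.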
Next, follow the method that starts the monitor threads:
/**
 * Starts the monitor threads. (MonitorCheckpoint, SnapshotStore and monitor
 * all appear to be keyed by the same monitor name.)
 * Creates a {@link DocumentSnapshotRepositoryMonitor} thread for each
 * startPath.
 *
 * @throws RepositoryDocumentException if any of the threads cannot be
 *         started.
 */
private void startMonitorThreads(Map<String, SnapshotStore> snapshotStores,
    Map<String, MonitorCheckpoint> monitorPoints)
    throws RepositoryDocumentException {
  for (SnapshotRepository<? extends DocumentSnapshot> repository
      : repositories) {
    String monitorName = makeMonitorNameFromStartPath(repository.getName());
    // Look up the snapshot store (reader/writer) registered under monitorName.
    SnapshotStore snapshotStore = snapshotStores.get(monitorName);
    // Create the monitor thread.
    Thread monitorThread = newMonitorThread(repository, snapshotStore,
        monitorPoints.get(monitorName));
    threads.add(monitorThread);

    LOG.info("starting monitor for <" + repository.getName() + ">");
    monitorThread.setName(repository.getName());
    monitorThread.setDaemon(true);
    monitorThread.start();
  }
}
The monitor object itself is created in the following method:
/**
 * Creates a monitor thread.
 * Creates a {@link DocumentSnapshotRepositoryMonitor} thread for the provided
 * folder.
 *
 * @throws RepositoryDocumentException if {@code startPath} is not readable,
 *         or if there is any problem reading or writing snapshots.
 */
private Thread newMonitorThread(
    SnapshotRepository<? extends DocumentSnapshot> repository,
    SnapshotStore snapshotStore, MonitorCheckpoint startCp)
    throws RepositoryDocumentException {
  // Note how monitorName is derived from the repository name.
  String monitorName = makeMonitorNameFromStartPath(repository.getName());
  // Documents are processed inside the monitor thread.
  DocumentSnapshotRepositoryMonitor monitor =
      new DocumentSnapshotRepositoryMonitor(monitorName, repository,
          snapshotStore, changeQueue.newCallback(), DOCUMENT_SINK, startCp,
          documentSnapshotFactory);
  monitor.setTraversalSchedule(traversalSchedule);
  LOG.fine("Adding a new monitor for " + monitorName + ": " + monitor);
  fileSystemMonitorsByName.put(monitorName, monitor);
  return new Thread(monitor);
}
The stop method stops the monitor threads:
/**
 * Flags every monitor to stop.
 */
private void flagAllMonitorsToStop() {
  for (SnapshotRepository<? extends DocumentSnapshot> repository
      : repositories) {
    String monitorName = makeMonitorNameFromStartPath(repository.getName());
    DocumentSnapshotRepositoryMonitor
        monitor = fileSystemMonitorsByName.get(monitorName);
    if (null != monitor) {
      monitor.shutdown();
    } else {
      LOG.fine("Unable to stop non existent monitor thread for "
          + monitorName);
    }
  }
}

/**
 * Stops the monitor threads.
 */
/* @Override */
public synchronized void stop() {
  for (Thread thread : threads) {
    thread.interrupt();
  }
  for (Thread thread : threads) {
    try {
      thread.join(MAX_SHUTDOWN_MS);
      if (thread.isAlive()) {
        LOG.warning("failed to stop background thread: " + thread.getName());
      }
    } catch (InterruptedException e) {
      // Mark this thread as interrupted so it can be dealt with later.
      Thread.currentThread().interrupt();
    }
  }
  threads.clear();

  /* in case thread.interrupt doesn't stop monitors */
  flagAllMonitorsToStop();

  fileSystemMonitorsByName.clear();
  changeQueue.clear();
  this.isRunning = false;
}
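The stop() method above follows a common two-phase pattern: interrupt every thread first, then join each one with a timeout. The standalone sketch below reproduces only that pattern. ShutdownDemo, startWorkers and the 1000 ms value of MAX_SHUTDOWN_MS are assumptions made for illustration (the real constant's value is not shown in this article), and the workers simply sleep in place of real monitoring work.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical demo of the interrupt-then-join shutdown pattern. */
class ShutdownDemo {
  // Assumed value for illustration only.
  static final long MAX_SHUTDOWN_MS = 1000;

  /** Starts daemon workers that sleep in a loop until they are interrupted. */
  static List<Thread> startWorkers(int count) {
    List<Thread> threads = new ArrayList<>();
    for (int i = 0; i < count; i++) {
      Thread worker = new Thread(() -> {
        try {
          while (true) {
            Thread.sleep(50); // stands in for one monitoring pass
          }
        } catch (InterruptedException e) {
          // Interrupted by stopAll(): leave the loop and let the thread end.
        }
      }, "worker-" + i);
      worker.setDaemon(true);
      worker.start();
      threads.add(worker);
    }
    return threads;
  }

  /**
   * Interrupts all threads, then waits up to MAX_SHUTDOWN_MS for each.
   * Returns the number of threads observed still alive after their join.
   */
  static int stopAll(List<Thread> threads) {
    for (Thread thread : threads) {
      thread.interrupt();
    }
    int stillAlive = 0;
    for (Thread thread : threads) {
      try {
        thread.join(MAX_SHUTDOWN_MS);
        if (thread.isAlive()) {
          stillAlive++;
        }
      } catch (InterruptedException e) {
        // Preserve the caller's interrupt status, as the original does.
        Thread.currentThread().interrupt();
      }
    }
    threads.clear();
    return stillAlive;
  }

  public static void main(String[] args) {
    List<Thread> threads = startWorkers(3);
    System.out.println("threads still alive after stop: " + stopAll(threads));
  }
}
```

Interrupting all threads before joining any of them lets the workers wind down in parallel, so the total wait is bounded by the slowest worker rather than the sum of all of them.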
flagAllMonitorsToStop() calls shutdown() on each monitor object, which sets the monitor's flag field:
/* The monitor should exit voluntarily if set to false */
private volatile boolean isRunning = true;
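A volatile flag of this kind lets a worker exit on its own even if it never observes an interrupt. The sketch below is a guess at the general shape, not the actual DocumentSnapshotRepositoryMonitor code: FlagMonitor and FlagShutdownDemo are hypothetical, and the loop body is a short sleep standing in for one monitoring pass.

```java
/** Hypothetical demo of cooperative shutdown through a volatile flag. */
class FlagShutdownDemo {

  /** Hypothetical monitor whose loop checks a volatile flag on every pass. */
  static class FlagMonitor implements Runnable {
    /* The monitor should exit voluntarily if set to false */
    private volatile boolean isRunning = true;

    @Override
    public void run() {
      while (isRunning) {
        try {
          Thread.sleep(10); // stands in for one monitoring pass
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
          return;
        }
      }
    }

    /** Asks the loop to finish; volatile makes the write visible to the worker. */
    void shutdown() {
      isRunning = false;
    }
  }

  /** Starts a monitor, flags it to stop without interrupting, and reports whether it ended. */
  static boolean runAndShutdown() {
    FlagMonitor monitor = new FlagMonitor();
    Thread thread = new Thread(monitor, "flag-monitor");
    thread.setDaemon(true);
    thread.start();
    monitor.shutdown();
    try {
      thread.join(2000);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
    return !thread.isAlive();
  }

  public static void main(String[] args) {
    System.out.println("monitor stopped by flag alone: " + runAndShutdown());
  }
}
```

Without volatile, the worker thread would have no guarantee of ever seeing the updated value, so the declaration is what makes the flag a reliable fallback when the interrupt is swallowed.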
---------------------------------------------------------------------------
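This series, Enterprise Search Engine Development: The Connector, is my original work.
If you repost it, please credit the source: cnblogs (博客园), 刺猬的温驯
My email: chenying998179@163#com (replace # with .)
Link to this article: http://www.cnblogs.com/chenying99/p/3789613.html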