Solr Related 24 Mar 2014

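Range queries

Use {!frange l=0 u=80 incl=false incu=false}myField to run a range query. l and u are the lower and upper bounds and must be numeric; incl controls whether the lower bound is included and incu whether the upper bound is included; myField is the field being queried. To add a statistics exclusion tag such as {!ex=dt}, write it as {!frange l=0 u=80 incl=false incu=false ex=dt}myField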
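When the bounds come from application code, it can help to assemble the frange string in one place. The following is only a minimal sketch: FrangeFilter and build are hypothetical names invented for illustration and are not part of Solr or SolrJ. The helper only concatenates the local params described above.

```java
/** Hypothetical helper, not part of Solr or SolrJ: assembles an frange local-params string. */
final class FrangeFilter {
    private FrangeFilter() {}

    /**
     * Builds "{!frange l=.. u=.. incl=.. incu=..[ ex=tag]}field".
     * exTag may be null or empty when no exclusion tag is needed.
     */
    static String build(String field, Number lower, Number upper,
                        boolean includeLower, boolean includeUpper, String exTag) {
        StringBuilder sb = new StringBuilder("{!frange");
        sb.append(" l=").append(lower);          // lower bound, must be numeric
        sb.append(" u=").append(upper);          // upper bound, must be numeric
        sb.append(" incl=").append(includeLower);
        sb.append(" incu=").append(includeUpper);
        if (exTag != null && !exTag.isEmpty()) {
            sb.append(" ex=").append(exTag);     // optional exclusion tag
        }
        return sb.append('}').append(field).toString();
    }

    public static void main(String[] args) {
        System.out.println(build("myField", 0, 80, false, false, null));
        System.out.println(build("myField", 0, 80, false, false, "dt"));
    }
}
```

The resulting string would then be passed as an ordinary query parameter value; how it is sent to Solr is left out here on purpose.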

SolrCloud from release 4.4

Solr is a J2EE-based project, so start with WEB-INF/web.xml. There Solr registers a Filter named SolrRequestFilter, which points to SolrDispatchFilter.

SolrCloud code analysis

web.xml

<!-- Any path (name) registered in solrconfig.xml will be sent to that filter -->
  <filter>
    <filter-name>SolrRequestFilter</filter-name>
    <filter-class>org.apache.solr.servlet.SolrDispatchFilter</filter-class>
  </filter>
  <filter-mapping>
    <filter-name>SolrRequestFilter</filter-name>
    <url-pattern>/*</url-pattern>
  </filter-mapping>

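When the J2EE container starts, it calls the init method of SolrDispatchFilter (the class registered as SolrRequestFilter). That method creates a CoreContainer instance for Solr and calls its load method.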

@Override
public void init(FilterConfig config) throws ServletException {
    log.info("SolrDispatchFilter.init()");
    try {
      // web.xml configuration
      this.pathPrefix = config.getInitParameter( "path-prefix" );
      this.cores = createCoreContainer();                    // create the CoreContainer
      log.info("user.dir=" + System.getProperty("user.dir"));
    }
    catch( Throwable t ) {
      // catch this so our filter still works
      log.error( "Could not start Solr. Check solr/home property and the logs");
      SolrCore.log( t );
    }
    log.info("SolrDispatchFilter.init() done");
}
 
// ...
 
protected CoreContainer createCoreContainer() {
    CoreContainer cores = new CoreContainer();
    cores.load();
    return cores;
}

Next, look at the load method of CoreContainer:

public void load()  {
    // ...
    zkHost = cfg.get(ConfigSolr.CfgProp.SOLR_ZKHOST, null);    // host (including port) of the ZooKeeper service
    zkClientTimeout = cfg.getInt(ConfigSolr.CfgProp.SOLR_ZKCLIENTTIMEOUT, DEFAULT_ZK_CLIENT_TIMEOUT);  // timeout setting for the connection to the ZooKeeper service
    distribUpdateConnTimeout = cfg.getInt(ConfigSolr.CfgProp.SOLR_DISTRIBUPDATECONNTIMEOUT, 0);
    distribUpdateSoTimeout = cfg.getInt(ConfigSolr.CfgProp.SOLR_DISTRIBUPDATESOTIMEOUT, 0);
    // Note: initZooKeeper will apply hardcoded default if cloud mode
    String hostPort = cfg.get(ConfigSolr.CfgProp.SOLR_HOSTPORT, null);
    // Note: initZooKeeper will apply hardcoded default if cloud mode
    String hostContext = cfg.get(ConfigSolr.CfgProp.SOLR_HOSTCONTEXT, null);
    String host = cfg.get(ConfigSolr.CfgProp.SOLR_HOST, null);
    String leaderVoteWait = cfg.get(ConfigSolr.CfgProp.SOLR_LEADERVOTEWAIT, LEADER_VOTE_WAIT);
    zkClientTimeout = Integer.parseInt(System.getProperty("zkClientTimeout",
                                            Integer.toString(zkClientTimeout)));
    // the lines above read the ZooKeeper-related settings from solr.xml
     
    zkSys.initZooKeeper(this, solrHome, zkHost, zkClientTimeout, hostPort,
                                hostContext, host, leaderVoteWait, genericCoreNodeNames,
                                distribUpdateConnTimeout, distribUpdateSoTimeout);
     
    if (isZooKeeperAware() && coreLoadThreads <= 1) {
      throw new SolrException(ErrorCode.SERVER_ERROR,
          "SolrCloud requires a value of at least 2 in solr.xml for coreLoadThreads");
    }
     
    // the remaining code is omitted here and analyzed later
 }
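Two details of load are easy to miss: the zkClientTimeout read from solr.xml can still be overridden by a system property of the same name, and cloud mode is rejected when coreLoadThreads is 1 or less. The sketch below isolates both rules in plain Java. LoadSettingsSketch, resolveTimeout and checkLoadThreads are hypothetical names, and IllegalStateException stands in for SolrException so the example has no Solr dependency.

```java
/** Hypothetical sketch of two rules from CoreContainer.load(); not Solr code. */
final class LoadSettingsSketch {
    private LoadSettingsSketch() {}

    /** A system property, when present, wins over the value read from the config file. */
    static int resolveTimeout(String propertyName, int configuredValue) {
        return Integer.parseInt(
            System.getProperty(propertyName, Integer.toString(configuredValue)));
    }

    /** Mirrors the guard in load(): cloud mode needs at least 2 core-loading threads. */
    static void checkLoadThreads(boolean zooKeeperAware, int coreLoadThreads) {
        if (zooKeeperAware && coreLoadThreads <= 1) {
            throw new IllegalStateException(
                "SolrCloud requires a value of at least 2 for coreLoadThreads");
        }
    }

    public static void main(String[] args) {
        // Try with -DzkClientTimeout=30000 to see the override take effect.
        System.out.println(resolveTimeout("zkClientTimeout", 15000));
        checkLoadThreads(true, 2); // passes silently
    }
}
```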

Here zkSys.initZooKeeper initializes ZooKeeper for Solr:

public void initZooKeeper(final CoreContainer cc, String solrHome, String zkHost, int zkClientTimeout,
                             String hostPort, String hostContext, String host, String leaderVoteWait,
                                boolean genericCoreNodeNames, int distribUpdateConnTimeout, int distribUpdateSoTimeout) {
    ZkController zkController = null;
     
    // if zkHost sys property is not set, we are not using ZooKeeper
    String zookeeperHost;
    if(zkHost == null) {
      zookeeperHost = System.getProperty("zkHost");
    } else {
      zookeeperHost = zkHost;
    }
 
 
    String zkRun = System.getProperty("zkRun");  // whether to run an embedded ZooKeeper service
     
    this.zkClientTimeout = zkClientTimeout;
    this.hostPort = hostPort;
    this.hostContext = hostContext;
    this.host = host;
    this.leaderVoteWait = leaderVoteWait;
    this.genericCoreNodeNames = genericCoreNodeNames;
    this.distribUpdateConnTimeout = distribUpdateConnTimeout;
    this.distribUpdateSoTimeout = distribUpdateSoTimeout;
     
    if (zkRun == null && zookeeperHost == null)
        return;  // not in zk mode
    // BEGIN: SOLR-4622: deprecated hardcoded defaults for hostPort & hostContext
    if (null == hostPort) {
      log.warn("Solr 'hostPort' has not be explicitly configured, using hardcoded default of " + DEFAULT_HOST_PORT + ".  This default has been deprecated and will be removed in future versions of Solr, please configure this value explicitly");
      hostPort = DEFAULT_HOST_PORT;
    }
    if (null == hostContext) {
      log.warn("Solr 'hostContext' has not be explicitly configured, using hardcoded default of " + DEFAULT_HOST_CONTEXT + ".  This default has been deprecated and will be removed in future versions of Solr, please configure this value explicitly");
      hostContext = DEFAULT_HOST_CONTEXT;
    }
    // END: SOLR-4622
 
    // zookeeper in quorum mode currently causes a failure when trying to
    // register log4j mbeans.  See SOLR-2369
    // TODO: remove after updating to an slf4j based zookeeper
    System.setProperty("zookeeper.jmx.log4j.disable", "true");
 
    if (zkRun != null) {
      String zkDataHome = System.getProperty("zkServerDataDir", solrHome + "zoo_data");  // path for the ZooKeeper data files
      String zkConfHome = System.getProperty("zkServerConfDir", solrHome);         // ZooKeeper configuration directory
      zkServer = new SolrZkServer(zkRun, zookeeperHost, zkDataHome, zkConfHome, hostPort);
      zkServer.parseConfig();
      zkServer.start();  // start the local ZooKeeper service
       
      // set client from server config if not already set
      if (zookeeperHost == null) {
        zookeeperHost = zkServer.getClientString();
      }
    }
 
    int zkClientConnectTimeout = 15000;
 
    if (zookeeperHost != null) {
      // we are ZooKeeper enabled
      try {
        // If this is an ensemble, allow for a long connect time for other servers to come up
        if (zkRun != null && zkServer.getServers().size() > 1) {
          zkClientConnectTimeout = 24 * 60 * 60 * 1000;  // 1 day for embedded ensemble
          log.info("Zookeeper client=" + zookeeperHost + "  Waiting for a quorum.");
        } else {
          log.info("Zookeeper client=" + zookeeperHost);         
        }
        String confDir = System.getProperty("bootstrap_confdir");  // directory of Solr configuration files
        boolean boostrapConf = Boolean.getBoolean("bootstrap_conf"); 
         
        if(!ZkController.checkChrootPath(zookeeperHost, (confDir!=null) || boostrapConf)) {
          throw new ZooKeeperException(SolrException.ErrorCode.SERVER_ERROR,
              "A chroot was specified in ZkHost but the znode doesn't exist. ");
        }
        zkController = new ZkController(cc, zookeeperHost, zkClientTimeout,
            zkClientConnectTimeout, host, hostPort, hostContext,
            leaderVoteWait, genericCoreNodeNames, distribUpdateConnTimeout, distribUpdateSoTimeout,
            new CurrentCoreDescriptorProvider() {
              @Override
              public List<CoreDescriptor> getCurrentDescriptors() {
                List<CoreDescriptor> descriptors = new ArrayList<CoreDescriptor>(
                    cc.getCoreNames().size());
                Collection<SolrCore> cores = cc.getCores();
                for (SolrCore core : cores) {
                  descriptors.add(core.getCoreDescriptor());
                }
                return descriptors;
              }
            });
 
        if (zkRun != null && zkServer.getServers().size() > 1 && confDir == null && boostrapConf == false) {
          // we are part of an ensemble and we are not uploading the config - pause to give the config time
          // to get up
          Thread.sleep(10000);
        }
         
        if(confDir != null) {
          File dir = new File(confDir);
          if(!dir.isDirectory()) {
            throw new IllegalArgumentException("bootstrap_confdir must be a directory of configuration files");
          }
          String confName = System.getProperty(ZkController.COLLECTION_PARAM_PREFIX+ZkController.CONFIGNAME_PROP,
                             "configuration1");
          zkController.uploadConfigDir(dir, confName);  // upload the given configuration directory to ZooKeeper
        }
         
        if(boostrapConf) {
          ZkController.bootstrapConf(zkController.getZkClient(), cc.cfg, solrHome);  // read solr.xml under solr home, iterate over every core, and upload each core's conf directory to ZooKeeper
        }
         
      } catch (InterruptedException e) {
        // Restore the interrupted status
        Thread.currentThread().interrupt();
        log.error("", e);
        throw new ZooKeeperException(SolrException.ErrorCode.SERVER_ERROR,
            "", e);
      } catch (TimeoutException e) {
        log.error("Could not connect to ZooKeeper", e);
        throw new ZooKeeperException(SolrException.ErrorCode.SERVER_ERROR,
            "", e);
      } catch (IOException e) {
        log.error("", e);
        throw new ZooKeeperException(SolrException.ErrorCode.SERVER_ERROR,
            "", e);
      } catch (KeeperException e) {
        log.error("", e);
        throw new ZooKeeperException(SolrException.ErrorCode.SERVER_ERROR,
            "", e);
      }
    }
    this.zkController = zkController;
  }
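The branching in initZooKeeper is easier to follow once the inputs are reduced to three strings: the zkHost from solr.xml, the zkHost system property, and the zkRun system property. The sketch below restates only that decision and the choice of connect timeout. ZkModeSketch, Mode, decide and connectTimeoutMs are hypothetical names for illustration; the constants are the ones visible in the listing above.

```java
/** Hypothetical restatement of the mode decision in initZooKeeper(); not Solr code. */
final class ZkModeSketch {
    enum Mode { NOT_CLOUD, EXTERNAL_ZK, EMBEDDED_ZK }

    private ZkModeSketch() {}

    static Mode decide(String zkHostFromConfig, String zkHostProperty, String zkRunProperty) {
        // The configured zkHost wins; the system property is only a fallback.
        String zookeeperHost = (zkHostFromConfig != null) ? zkHostFromConfig : zkHostProperty;
        if (zkRunProperty == null && zookeeperHost == null) {
            return Mode.NOT_CLOUD;        // "not in zk mode"
        }
        // zkRun set: Solr starts its own ZooKeeper and, if needed, takes the client string from it.
        return (zkRunProperty != null) ? Mode.EMBEDDED_ZK : Mode.EXTERNAL_ZK;
    }

    /** 15 seconds normally; one day when waiting for an embedded ensemble to form a quorum. */
    static int connectTimeoutMs(boolean embedded, int serverCount) {
        return (embedded && serverCount > 1) ? 24 * 60 * 60 * 1000 : 15000;
    }

    public static void main(String[] args) {
        System.out.println(decide(null, System.getProperty("zkHost"), System.getProperty("zkRun")));
    }
}
```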

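In the zkRun != null branch of the listing above, a SolrZkServer instance is created, its parseConfig method reads the settings from zoo.cfg, and its start method launches the ZooKeeper service on a new thread. Here is the start method of SolrZkServer: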

public void start() {
    if (zkRun == null) return;
    zkThread = new Thread() {
      @Override
      public void run() {
        try {
          if (zkProps.getServers().size() > 1) {
            QuorumPeerMain zkServer = new QuorumPeerMain();       // more than one ZooKeeper server: use QuorumPeerMain to run in ensemble mode
            zkServer.runFromConfig(zkProps);
          } else {
            ServerConfig sc = new ServerConfig();
            sc.readFrom(zkProps);
            ZooKeeperServerMain zkServer = new ZooKeeperServerMain();  // a single ZooKeeper server: use ZooKeeperServerMain to run in standalone mode
            zkServer.runFromConfig(sc);
          }
          log.info("ZooKeeper Server exited.");
        } catch (Throwable e) {
          log.error("ZooKeeper Server ERROR", e);
          throw new SolrException(SolrException.ErrorCode.SERVER_ERROR, e);
        }
      }
    };
 
    if (zkProps.getServers().size() > 1) {
      log.info("STARTING EMBEDDED ENSEMBLE ZOOKEEPER SERVER at port " + zkProps.getClientPortAddress().getPort());
    } else {
      log.info("STARTING EMBEDDED STANDALONE ZOOKEEPER SERVER at port " + zkProps.getClientPortAddress().getPort());
    }
 
    zkThread.setDaemon(true);
    zkThread.start();
    try {
      Thread.sleep(500); // pause for ZooKeeper to start
    } catch (Exception e) {
      log.error("STARTING ZOOKEEPER", e);
    }
  }

At this point, the ZooKeeper service has been initialized.

SolrCloud-related configuration:
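The shape of SolrZkServer.start (run a blocking server on a daemon thread so it never keeps the JVM alive, and pick the mode from the number of configured servers) can be sketched without any ZooKeeper dependency. Everything named here is hypothetical: EmbeddedStartSketch, BlockingServer, describeMode and startInBackground are illustration-only, and BlockingServer merely stands in for the runFromConfig calls.

```java
/** Hypothetical sketch of the daemon-thread startup pattern; not Solr or ZooKeeper code. */
final class EmbeddedStartSketch {
    /** Stand-in for a server whose run method blocks until shutdown. */
    interface BlockingServer {
        void run() throws Exception;
    }

    private EmbeddedStartSketch() {}

    /** More than one configured server means ensemble mode, otherwise standalone. */
    static String describeMode(int serverCount) {
        return serverCount > 1 ? "ensemble" : "standalone";
    }

    static Thread startInBackground(BlockingServer server) {
        Thread thread = new Thread(() -> {
            try {
                server.run();
            } catch (Exception e) {
                throw new RuntimeException("embedded server failed", e);
            }
        }, "embedded-server-sketch");
        thread.setDaemon(true); // must be set before start(); lets the JVM exit freely
        thread.start();
        return thread;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("starting in " + describeMode(1) + " mode");
        startInBackground(() -> System.out.println("server running"));
        Thread.sleep(500); // same crude pause as the original: give the server time to start
    }
}
```

The fixed 500 ms pause is a simplification copied from the listing; it does not confirm that the server is actually ready.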

| Name | Default | Description | Adopted |
|---|---|---|---|
| numShards | 1 | The number of shards to hash documents to. There will be one leader per shard and each leader can have N replicas. | |
| SolrCloud Instance Params | | | |
| host | The first local host address found | If the wrong host address is found automatically, you can override the host address with this param. | System |
| hostPort | The jetty.port system property | The port that Solr is running on. By default this is found by looking at the jetty.port system property. | System |
| hostContext | solr | The context path for the Solr webapp. (Note: in Solr 4.0, it was mandatory that the hostContext not contain "/" or "_" characters. Beginning with Solr 4.1, this limitation was removed, and it is recommended that you specify the beginning slash. When running in the example jetty configs, the "hostContext" system property can be used to control both the servlet context used by jetty and the hostContext used by SolrCloud, e.g. -DhostContext=/solr) | |
| SolrCloud Zookeeper Instance Params | | | |
| zkRun | localhost:<solrPort+1001> | Causes Solr to run an embedded version of ZooKeeper. Set to the address of ZooKeeper on this node; this allows us to know who 'we are' in the list of addresses in the zkHost connect string. Simply using -DzkRun gets you the default value. Note this must be one of the exact strings from zkHost; in particular, the default localhost will not work for a multi-machine ensemble. | System |
| zkHost | No default | The host address for ZooKeeper. Usually this should be a comma-separated list of addresses to each node in your ZooKeeper ensemble. | System+File |
| zkClientTimeout | 15000 | The time a client is allowed to not talk to ZooKeeper before having its session expired. | System+File |
| SolrCloud Core Params | | | |
| shard | The shard id. Defaults to being automatically assigned based on numShards | Allows you to specify the id used to group SolrCores into shards. | |
| Config Startup Bootstrap Params | | | |
| bootstrap_conf | No default | If you pass -Dbootstrap_conf=true on startup, each SolrCore you have configured will have its configuration files automatically uploaded and linked to the collection that SolrCore is part of. | |
| bootstrap_confdir | No default | If you pass -Dbootstrap_confdir=<directory> on startup, that specific directory of configuration files will be uploaded to ZooKeeper with a 'conf set' name defined by the system property below, collection.configName. | System |
| collection.configName | configuration1 | Determines the name of the conf set pointed to by bootstrap_confdir. | |
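The three bootstrap parameters interact the same way they do in initZooKeeper: bootstrap_confdir and bootstrap_conf are checked independently, and collection.configName only matters when bootstrap_confdir is set. A minimal sketch of that interaction follows. BootstrapParamsSketch and plannedUploads are hypothetical names, and the method takes a Properties object instead of reading the real system properties so it can be tried in isolation. It only describes what would be uploaded and performs no upload.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

/** Hypothetical sketch of how the bootstrap parameters combine; not Solr code. */
final class BootstrapParamsSketch {
    private BootstrapParamsSketch() {}

    static List<String> plannedUploads(Properties props) {
        String confDir = props.getProperty("bootstrap_confdir");
        // Same semantics as Boolean.getBoolean: only the string "true" (any case) enables it.
        boolean bootstrapConf = Boolean.parseBoolean(props.getProperty("bootstrap_conf"));
        String confName = props.getProperty("collection.configName", "configuration1");

        List<String> actions = new ArrayList<>();
        if (confDir != null) {
            actions.add("upload " + confDir + " as conf set " + confName);
        }
        if (bootstrapConf) {
            actions.add("upload the conf directory of every configured core");
        }
        return actions;
    }

    public static void main(String[] args) {
        // Reads the JVM's own properties, e.g. started with -Dbootstrap_conf=true
        plannedUploads(System.getProperties()).forEach(System.out::println);
    }
}
```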