
Blog Archives

Massive Web Log Analysis: Extracting KPI Statistics with Hadoop

The Hadoop-family series introduces the products of the Hadoop ecosystem. Frequently used projects include Hadoop, Hive, Pig, HBase, Sqoop, Mahout, Zookeeper, Avro, Ambari, and Chukwa; newer additions include YARN, Hcatalog, Oozie, Cassandra, Hama, Whirr, Flume, Bigtop, Crunch, Hue, and others.

Since 2011 China has entered a booming era of big data, and the Hadoop family of software has claimed a broad share of big data processing. The open-source community, the vendors, and virtually every data product have gravitated toward Hadoop, which has grown from a niche, elite technology into the standard for big data development. On top of the original Hadoop technology, a whole family of products keeps innovating around the "big data" concept and pushing the technology forward.

As developers in IT, we should keep up with the pace, seize the opportunity, and rise together with Hadoop!

About the author:

  • 张丹 (Conan), programmer: Java, R, PHP, Javascript
  • weibo: @Conan_Z
  • blog: http://whgmhg.com
  • email: bsspirit@gmail.com

Please cite the source when reposting:
http://whgmhg.com/hadoop-mapreduce-log-kpi/

hadoop-kpi-logo

Preface

Web logs contain a website's most important information. By analyzing them we can learn the site's traffic, which pages are visited most, which pages are the most valuable, and so on. A typical medium-sized site (more than 100,000 PV) produces over 1 GB of web log files per day; large and very large sites can generate 10 GB of data per hour.

For log data at this scale, Hadoop is an ideal fit for the analysis.

Table of Contents

  1. Overview of web log analysis
  2. Requirements analysis: KPI metric design
  3. Algorithm model: parallel algorithms on Hadoop
  4. Architecture: the log KPI system
  5. Program development 1: building the Hadoop project with Maven
  6. Program development 2: implementing the MapReduce programs

1. Overview of Web Log Analysis

Web logs are produced by web servers such as Nginx, Apache, or Tomcat. From them we can obtain the PV (PageView) of each class of page and the number of unique IPs; with a bit more work we can compute rankings of the keywords users searched for or the pages where users stay the longest; with more effort still we can build ad click models and analyze user behavior.

In a web log, each line usually represents one user visit. For example, here is an nginx log entry:


222.68.172.190 - - [18/Sep/2013:06:49:57 +0000] "GET /images/my.jpg HTTP/1.1" 200 19939
 "http://www.angularjs.cn/A00n" "Mozilla/5.0 (Windows NT 6.1)
 AppleWebKit/537.36 (KHTML, like Gecko) Chrome/29.0.1547.66 Safari/537.36"

It breaks down into the following 8 variables:

  • remote_addr: the client IP address, 222.68.172.190
  • remote_user: the client user name, –
  • time_local: the access time and time zone, [18/Sep/2013:06:49:57 +0000]
  • request: the requested URL and HTTP protocol, "GET /images/my.jpg HTTP/1.1"
  • status: the request status (200 on success), 200
  • body_bytes_sent: the size of the response body sent to the client, 19939
  • http_referer: the page the request was linked from, "http://www.angularjs.cn/A00n"
  • http_user_agent: information about the client browser, "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/29.0.1547.66 Safari/537.36"

Note: to collect more information than this, other techniques are needed, such as sending separate requests from JavaScript or recording visit details in cookies.

With this log information we can start digging into the site's secrets.

The small-data case

When the data volume is small (10 MB, 100 MB, even 10 GB) and single-machine processing is still bearable, we can work directly with the usual Unix/Linux tools: awk, grep, sort, join and the like are all excellent for log analysis, and combined with perl, python, and regular expressions they can handle almost any problem.

For example, getting the 10 IPs with the highest request counts from the nginx log above is straightforward (note that sort here compares the counts as strings; add -n, i.e. sort -rn -k2, for a true numeric top 10):


~ cat access.log.10 | awk '{a[$1]++} END {for(b in a) print b"\t"a[b]}' | sort -k2 -r | head -n 10
163.177.71.12   972
101.226.68.137  972
183.195.232.138 971
50.116.27.194   97
14.17.29.86     96
61.135.216.104  94
61.135.216.105  91
61.186.190.41   9
59.39.192.108   9
220.181.51.212  9

The big-data case

When the data grows by 10 GB or 100 GB per day, a single machine can no longer keep up. We have to accept more system complexity and turn to computer clusters and storage arrays. Before Hadoop appeared, storing and analyzing logs at this scale was very hard; only a few companies possessed the core technologies of efficient parallel computing, distributed computing, and distributed storage.

The arrival of Hadoop dramatically lowered the barrier to processing massive data, putting it within reach of small companies and even individuals. And Hadoop is particularly well suited to log analysis systems.

2. Requirements Analysis: KPI Metric Design

Below we use a company case to explain, end to end, how to analyze massive web logs with Hadoop and extract KPI data.

Case description:
An e-commerce website running an online group-buying business. Daily PV is 1,000,000 with 50,000 unique IPs. Traffic peaks on workdays between 10:00-12:00 and 15:00-18:00. During the day most visits come from desktop browsers; on weekends and at night mobile devices dominate. Search pages account for 80% of the site's page views. Fewer than 1% of PC users make a purchase, while 5% of mobile users do.

From this short description we can roughly see how this e-commerce site is doing, where the paying users come from, which potential users could be tapped, and whether the site is at risk of failing.

KPI metric design

  • PV (PageView): page view counts
  • IP: unique IPs per page
  • Time: PV per hour
  • Source: referrer domain counts
  • Browser: visitor device (user agent) counts

Note: business confidentiality prevents me from publishing the e-commerce site's logs.
The analysis below instead uses data extracted from my personal website.

Baidu Analytics statistics for my personal website, http://www.fens.me

Basic statistics:
hadoop-kpi-baidu

Visitor device statistics:
hadoop-kpi-baidu2

From a business perspective a personal website differs from an e-commerce site: there is no conversion rate and the bounce rate is fairly high. From a technical perspective, though, both care about the same KPI design.

3. Algorithm Model: Parallel Algorithms on Hadoop

hadoop-kpi-log

Designing the parallel algorithms:
Note: the keys and values below refer to the 8 variables defined in section 1. A small Java sketch of one of these jobs is given after the lists.

PV (PageView): page view counts

  • Map: {key: $request, value: 1}
  • Reduce: {key: $request, value: sum}

IP: unique IPs per page

  • Map: {key: $request, value: $remote_addr}
  • Reduce: {key: $request, value: count of distinct values (sum(unique))}

Time: PV per hour

  • Map: {key: $time_local, value: 1}
  • Reduce: {key: $time_local, value: sum}

Source: referrer domain counts

  • Map: {key: $http_referer, value: 1}
  • Reduce: {key: $http_referer, value: sum}

Browser: visitor device (user agent) counts

  • Map: {key: $http_user_agent, value: 1}
  • Reduce: {key: $http_user_agent, value: sum}
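To make one of these mappings concrete, here is a minimal Java sketch of how the Time job's mapper could look, written against the old mapred API and the KPI class shown later in this article. It is only an illustration: the class name KPITimeMapper is hypothetical, it reuses KPI.filterPVs() purely because that is the only public parse entry point shown in this article (a real implementation would use its own filter), and the per-hour summation is done by a reducer identical to the KPIPVReducer of section 6. The actual KPITime.java in the repository may differ.

    // Assumes the same imports as KPIPV.java in section 6, plus java.text.ParseException.
    public static class KPITimeMapper extends MapReduceBase implements Mapper<Object, Text, Text, IntWritable> {
        private IntWritable one = new IntWritable(1);
        private Text word = new Text();

        @Override
        public void map(Object key, Text value, OutputCollector<Text, IntWritable> output, Reporter reporter) throws IOException {
            KPI kpi = KPI.filterPVs(value.toString());       // illustrative: any filter that yields a parsed, valid KPI object works here
            if (kpi.isValid()) {
                try {
                    word.set(kpi.getTime_local_Date_hour()); // key: the hour bucket, e.g. 2013091806
                    output.collect(word, one);               // value: 1, summed per hour by the reducer
                } catch (ParseException e) {
                    // skip records whose timestamp cannot be parsed
                }
            }
        }
    }

Hooking this mapper up to a sum reducer (the same pattern as KPIPVReducer below) yields the per-hour PV counts.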

4. Architecture: the Log KPI System

hadoop-kpi-architect

In the figure above, the left side is the application (business) system and the right side is Hadoop's HDFS and MapReduce.

  1. The logs are produced by the business system; we can have the web server create a new directory every day, containing several log files of 64 MB each.
  2. A cron job runs shortly after midnight and imports the previous day's log files into HDFS (a Java sketch of this step follows the list).
  3. Once the import finishes, another scheduled job starts the MapReduce programs that extract and compute the KPI metrics.
  4. When the computation finishes, a third scheduled job exports the metric data from HDFS into a database for convenient ad-hoc queries later.
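Step 2 above is normally just a cron entry that calls the hadoop shell commands shown in the next section; equivalently, the import can be done from Java through the HDFS FileSystem API. The following is only a minimal sketch of that idea: the class name LogImporter is hypothetical, and the paths and HDFS address simply reuse the values that appear later in this article.

import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class LogImporter {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // connect to the HDFS namenode used later in this article
        FileSystem fs = FileSystem.get(URI.create("hdfs://192.168.1.210:9000"), conf);

        // copy yesterday's local log file into the log_kpi directory
        fs.copyFromLocalFile(new Path("/home/conan/datafiles/access.log.10"),
                             new Path("/user/hdfs/log_kpi/"));
        fs.close();
    }
}

Scheduling this class (or the equivalent hadoop fs -copyFromLocal command) from cron completes the nightly import.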

hadoop-kpi-process

The figure above shows more clearly how the data flows. The parts on the blue background run inside Hadoop; our remaining task is to implement the MapReduce programs.

5. Program Development 1: Building the Hadoop Project with Maven

Please refer to the article: Building a Hadoop Project with Maven.

The Win7 development environment and the Hadoop runtime environment were already covered in that article.

We need to upload the log file to the /user/hdfs/log_kpi/ directory in HDFS; see the commands below.


~ hadoop fs -mkdir /user/hdfs/log_kpi
~ hadoop fs -copyFromLocal /home/conan/datafiles/access.log.10 /user/hdfs/log_kpi/

I have published the complete MapReduce implementation on github:

https://github.com/bsspirit/maven_hadoop_template/releases/tag/kpi_v1

6. Program Development 2: Implementing the MapReduce Programs

Development steps:

  1. Parse a log line
  2. Implement the map function
  3. Implement the reduce function
  4. Implement the job driver

1). Parsing a log line
Create the file: org.conan.myhadoop.mr.kpi.KPI.java


package org.conan.myhadoop.mr.kpi;

import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;

/*
 * KPI Object
 */
public class KPI {
    private String remote_addr;// the client IP address

    private String remote_user;// the client user name; "-" when absent
    private String time_local;// the access time and time zone
    private String request;// the requested URL and HTTP protocol
    private String status;// the request status; 200 on success
    private String body_bytes_sent;// the size of the response body sent to the client
    private String http_referer;// the page the request was linked from
    private String http_user_agent;// information about the client browser

    private boolean valid = true;// whether the record is valid
    
    @Override
    public String toString() {
        StringBuilder sb = new StringBuilder();
        sb.append("valid:" + this.valid);
        sb.append("\nremote_addr:" + this.remote_addr);
        sb.append("\nremote_user:" + this.remote_user);
        sb.append("\ntime_local:" + this.time_local);
        sb.append("\nrequest:" + this.request);
        sb.append("\nstatus:" + this.status);
        sb.append("\nbody_bytes_sent:" + this.body_bytes_sent);
        sb.append("\nhttp_referer:" + this.http_referer);
        sb.append("\nhttp_user_agent:" + this.http_user_agent);
        return sb.toString();
    }

    public String getRemote_addr() {
        return remote_addr;
    }

    public void setRemote_addr(String remote_addr) {
        this.remote_addr = remote_addr;
    }

    public String getRemote_user() {
        return remote_user;
    }

    public void setRemote_user(String remote_user) {
        this.remote_user = remote_user;
    }

    public String getTime_local() {
        return time_local;
    }

    public Date getTime_local_Date() throws ParseException {
        SimpleDateFormat df = new SimpleDateFormat("dd/MMM/yyyy:HH:mm:ss", Locale.US);
        return df.parse(this.time_local);
    }
    
    public String getTime_local_Date_hour() throws ParseException{
        SimpleDateFormat df = new SimpleDateFormat("yyyyMMddHH");
        return df.format(this.getTime_local_Date());
    }

    public void setTime_local(String time_local) {
        this.time_local = time_local;
    }

    public String getRequest() {
        return request;
    }

    public void setRequest(String request) {
        this.request = request;
    }

    public String getStatus() {
        return status;
    }

    public void setStatus(String status) {
        this.status = status;
    }

    public String getBody_bytes_sent() {
        return body_bytes_sent;
    }

    public void setBody_bytes_sent(String body_bytes_sent) {
        this.body_bytes_sent = body_bytes_sent;
    }

    public String getHttp_referer() {
        return http_referer;
    }
    
    public String getHttp_referer_domain(){
        if(http_referer.length()<8){ 
            return http_referer;
        }
        
        String str=this.http_referer.replace("\"", "").replace("http://", "").replace("https://", "");
        return str.indexOf("/")>0?str.substring(0, str.indexOf("/")):str;
    }

    public void setHttp_referer(String http_referer) {
        this.http_referer = http_referer;
    }

    public String getHttp_user_agent() {
        return http_user_agent;
    }

    public void setHttp_user_agent(String http_user_agent) {
        this.http_user_agent = http_user_agent;
    }

    public boolean isValid() {
        return valid;
    }

    public void setValid(boolean valid) {
        this.valid = valid;
    }

    public static void main(String args[]) {
        String line = "222.68.172.190 - - [18/Sep/2013:06:49:57 +0000] \"GET /images/my.jpg HTTP/1.1\" 200 19939 \"http://www.angularjs.cn/A00n\" \"Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/29.0.1547.66 Safari/537.36\"";
        System.out.println(line);
        KPI kpi = new KPI();
        String[] arr = line.split(" ");

        kpi.setRemote_addr(arr[0]);
        kpi.setRemote_user(arr[1]);
        kpi.setTime_local(arr[3].substring(1));
        kpi.setRequest(arr[6]);
        kpi.setStatus(arr[8]);
        kpi.setBody_bytes_sent(arr[9]);
        kpi.setHttp_referer(arr[10]);
        kpi.setHttp_user_agent(arr[11] + " " + arr[12]);
        System.out.println(kpi);

        try {
            SimpleDateFormat df = new SimpleDateFormat("yyyy.MM.dd:HH:mm:ss", Locale.US);
            System.out.println(df.format(kpi.getTime_local_Date()));
            System.out.println(kpi.getTime_local_Date_hour());
            System.out.println(kpi.getHttp_referer_domain());
        } catch (ParseException e) {
            e.printStackTrace();
        }
    }

}

Take one line from the log file and write a simple parsing test in the main method.

Console output:


222.68.172.190 - - [18/Sep/2013:06:49:57 +0000] "GET /images/my.jpg HTTP/1.1" 200 19939 "http://www.angularjs.cn/A00n" "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/29.0.1547.66 Safari/537.36"
valid:true
remote_addr:222.68.172.190
remote_user:-
time_local:18/Sep/2013:06:49:57
request:/images/my.jpg
status:200
body_bytes_sent:19939
http_referer:"http://www.angularjs.cn/A00n"
http_user_agent:"Mozilla/5.0 (Windows
2013.09.18:06:49:57
2013091806
www.angularjs.cn

The log line is parsed correctly into the attributes of the kpi object. Next we extract the parsing logic into a separate method.


    private static KPI parser(String line) {
        System.out.println(line);
        KPI kpi = new KPI();
        String[] arr = line.split(" ");
        if (arr.length > 11) {
            kpi.setRemote_addr(arr[0]);
            kpi.setRemote_user(arr[1]);
            kpi.setTime_local(arr[3].substring(1));
            kpi.setRequest(arr[6]);
            kpi.setStatus(arr[8]);
            kpi.setBody_bytes_sent(arr[9]);
            kpi.setHttp_referer(arr[10]);
            
            if (arr.length > 12) {
                kpi.setHttp_user_agent(arr[11] + " " + arr[12]);
            } else {
                kpi.setHttp_user_agent(arr[11]);
            }

            if (Integer.parseInt(kpi.getStatus()) >= 400) {// status codes of 400 and above are HTTP errors
                kpi.setValid(false);
            }
        } else {
            kpi.setValid(false);
        }
        return kpi;
    }

We implement the map method, the reduce method, and the job driver in a separate class for each metric.

The MapReduce implementation classes are introduced below:

  • PV:org.conan.myhadoop.mr.kpi.KPIPV.java
  • IP: org.conan.myhadoop.mr.kpi.KPIIP.java
  • Time: org.conan.myhadoop.mr.kpi.KPITime.java
  • Browser: org.conan.myhadoop.mr.kpi.KPIBrowser.java

1). PV:org.conan.myhadoop.mr.kpi.KPIPV.java


package org.conan.myhadoop.mr.kpi;

import java.io.IOException;
import java.util.Iterator;

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reducer;
import org.apache.hadoop.mapred.Reporter;
import org.apache.hadoop.mapred.TextInputFormat;
import org.apache.hadoop.mapred.TextOutputFormat;

public class KPIPV { 

    public static class KPIPVMapper extends MapReduceBase implements Mapper<Object, Text, Text, IntWritable> {
        private IntWritable one = new IntWritable(1);
        private Text word = new Text();

        @Override
        public void map(Object key, Text value, OutputCollector<Text, IntWritable> output, Reporter reporter) throws IOException {
            KPI kpi = KPI.filterPVs(value.toString());
            if (kpi.isValid()) {
                word.set(kpi.getRequest());
                output.collect(word, one);
            }
        }
    }

    public static class KPIPVReducer extends MapReduceBase implements Reducer<Text, IntWritable, Text, IntWritable> {
        private IntWritable result = new IntWritable();

        @Override
        public void reduce(Text key, Iterator<IntWritable> values, OutputCollector<Text, IntWritable> output, Reporter reporter) throws IOException {
            int sum = 0;
            while (values.hasNext()) {
                sum += values.next().get();
            }
            result.set(sum);
            output.collect(key, result);
        }
    }

    public static void main(String[] args) throws Exception {
        String input = "hdfs://192.168.1.210:9000/user/hdfs/log_kpi/";
        String output = "hdfs://192.168.1.210:9000/user/hdfs/log_kpi/pv";

        JobConf conf = new JobConf(KPIPV.class);
        conf.setJobName("KPIPV");
        conf.addResource("classpath:/hadoop/core-site.xml");
        conf.addResource("classpath:/hadoop/hdfs-site.xml");
        conf.addResource("classpath:/hadoop/mapred-site.xml");

        conf.setMapOutputKeyClass(Text.class);
        conf.setMapOutputValueClass(IntWritable.class);

        conf.setOutputKeyClass(Text.class);
        conf.setOutputValueClass(IntWritable.class);

        conf.setMapperClass(KPIPVMapper.class);
        conf.setCombinerClass(KPIPVReducer.class);
        conf.setReducerClass(KPIPVReducer.class);

        conf.setInputFormat(TextInputFormat.class);
        conf.setOutputFormat(TextOutputFormat.class);

        FileInputFormat.setInputPaths(conf, new Path(input));
        FileOutputFormat.setOutputPath(conf, new Path(output));

        JobClient.runJob(conf);
        System.exit(0);
    }
}

The program calls a method of the KPI class:

KPI kpi = KPI.filterPVs(value.toString());

Through the filterPVs method we gain finer control over which page views are counted.

Add a filterPVs method to KPI.java:


    /**
     * Count PV only for the selected pages
     */
    public static KPI filterPVs(String line) {
        KPI kpi = parser(line);
        Set<String> pages = new HashSet<String>();
        pages.add("/about");
        pages.add("/black-ip-list/");
        pages.add("/cassandra-clustor/");
        pages.add("/finance-rhive-repurchase/");
        pages.add("/hadoop-family-roadmap/");
        pages.add("/hadoop-hive-intro/");
        pages.add("/hadoop-zookeeper-intro/");
        pages.add("/hadoop-mahout-roadmap/");

        if (!pages.contains(kpi.getRequest())) {
            kpi.setValid(false);
        }
        return kpi;
    }

In filterPVs we define a set of pages to filter on, so PV statistics are computed only for those pages.

Let's run KPIPV.java:


2013-10-9 11:53:28 org.apache.hadoop.mapred.MapTask$MapOutputBuffer flush
信息: Starting flush of map output
2013-10-9 11:53:28 org.apache.hadoop.mapred.MapTask$MapOutputBuffer sortAndSpill
信息: Finished spill 0
2013-10-9 11:53:28 org.apache.hadoop.mapred.Task done
信息: Task:attempt_local_0001_m_000000_0 is done. And is in the process of commiting
2013-10-9 11:53:30 org.apache.hadoop.mapred.LocalJobRunner$Job statusUpdate
信息: hdfs://192.168.1.210:9000/user/hdfs/log_kpi/access.log.10:0+3025757
2013-10-9 11:53:30 org.apache.hadoop.mapred.LocalJobRunner$Job statusUpdate
信息: hdfs://192.168.1.210:9000/user/hdfs/log_kpi/access.log.10:0+3025757
2013-10-9 11:53:30 org.apache.hadoop.mapred.Task sendDone
信息: Task 'attempt_local_0001_m_000000_0' done.
2013-10-9 11:53:30 org.apache.hadoop.mapred.Task initialize
信息:  Using ResourceCalculatorPlugin : null
2013-10-9 11:53:30 org.apache.hadoop.mapred.LocalJobRunner$Job statusUpdate
信息: 
2013-10-9 11:53:30 org.apache.hadoop.mapred.Merger$MergeQueue merge
信息: Merging 1 sorted segments
2013-10-9 11:53:30 org.apache.hadoop.mapred.Merger$MergeQueue merge
信息: Down to the last merge-pass, with 1 segments left of total size: 213 bytes
2013-10-9 11:53:30 org.apache.hadoop.mapred.LocalJobRunner$Job statusUpdate
信息: 
2013-10-9 11:53:30 org.apache.hadoop.mapred.Task done
信息: Task:attempt_local_0001_r_000000_0 is done. And is in the process of commiting
2013-10-9 11:53:30 org.apache.hadoop.mapred.LocalJobRunner$Job statusUpdate
信息: 
2013-10-9 11:53:30 org.apache.hadoop.mapred.Task commit
信息: Task attempt_local_0001_r_000000_0 is allowed to commit now
2013-10-9 11:53:30 org.apache.hadoop.mapred.FileOutputCommitter commitTask
信息: Saved output of task 'attempt_local_0001_r_000000_0' to hdfs://192.168.1.210:9000/user/hdfs/log_kpi/pv
2013-10-9 11:53:31 org.apache.hadoop.mapred.JobClient monitorAndPrintJob
信息:  map 100% reduce 0%
2013-10-9 11:53:33 org.apache.hadoop.mapred.LocalJobRunner$Job statusUpdate
信息: reduce > reduce
2013-10-9 11:53:33 org.apache.hadoop.mapred.Task sendDone
信息: Task 'attempt_local_0001_r_000000_0' done.
2013-10-9 11:53:34 org.apache.hadoop.mapred.JobClient monitorAndPrintJob
信息:  map 100% reduce 100%
2013-10-9 11:53:34 org.apache.hadoop.mapred.JobClient monitorAndPrintJob
信息: Job complete: job_local_0001
2013-10-9 11:53:34 org.apache.hadoop.mapred.Counters log
信息: Counters: 20
2013-10-9 11:53:34 org.apache.hadoop.mapred.Counters log
信息:   File Input Format Counters 
2013-10-9 11:53:34 org.apache.hadoop.mapred.Counters log
信息:     Bytes Read=3025757
2013-10-9 11:53:34 org.apache.hadoop.mapred.Counters log
信息:   File Output Format Counters 
2013-10-9 11:53:34 org.apache.hadoop.mapred.Counters log
信息:     Bytes Written=183
2013-10-9 11:53:34 org.apache.hadoop.mapred.Counters log
信息:   FileSystemCounters
2013-10-9 11:53:34 org.apache.hadoop.mapred.Counters log
信息:     FILE_BYTES_READ=545
2013-10-9 11:53:34 org.apache.hadoop.mapred.Counters log
信息:     HDFS_BYTES_READ=6051514
2013-10-9 11:53:34 org.apache.hadoop.mapred.Counters log
信息:     FILE_BYTES_WRITTEN=83472
2013-10-9 11:53:34 org.apache.hadoop.mapred.Counters log
信息:     HDFS_BYTES_WRITTEN=183
2013-10-9 11:53:34 org.apache.hadoop.mapred.Counters log
信息:   Map-Reduce Framework
2013-10-9 11:53:34 org.apache.hadoop.mapred.Counters log
信息:     Map output materialized bytes=217
2013-10-9 11:53:34 org.apache.hadoop.mapred.Counters log
信息:     Map input records=14619
2013-10-9 11:53:34 org.apache.hadoop.mapred.Counters log
信息:     Reduce shuffle bytes=0
2013-10-9 11:53:34 org.apache.hadoop.mapred.Counters log
信息:     Spilled Records=16
2013-10-9 11:53:34 org.apache.hadoop.mapred.Counters log
信息:     Map output bytes=2004
2013-10-9 11:53:34 org.apache.hadoop.mapred.Counters log
信息:     Total committed heap usage (bytes)=376569856
2013-10-9 11:53:34 org.apache.hadoop.mapred.Counters log
信息:     Map input bytes=3025757
2013-10-9 11:53:34 org.apache.hadoop.mapred.Counters log
信息:     SPLIT_RAW_BYTES=110
2013-10-9 11:53:34 org.apache.hadoop.mapred.Counters log
信息:     Combine input records=76
2013-10-9 11:53:34 org.apache.hadoop.mapred.Counters log
信息:     Reduce input records=8
2013-10-9 11:53:34 org.apache.hadoop.mapred.Counters log
信息:     Reduce input groups=8
2013-10-9 11:53:34 org.apache.hadoop.mapred.Counters log
信息:     Combine output records=8
2013-10-9 11:53:34 org.apache.hadoop.mapred.Counters log
信息:     Reduce output records=8
2013-10-9 11:53:34 org.apache.hadoop.mapred.Counters log
信息:     Map output records=76

View the result file in HDFS with the hadoop command:


~ hadoop fs -cat /user/hdfs/log_kpi/pv/part-00000

/about  5
/black-ip-list/ 2
/cassandra-clustor/     3
/finance-rhive-repurchase/      13
/hadoop-family-roadmap/ 13
/hadoop-hive-intro/     14
/hadoop-mahout-roadmap/ 20
/hadoop-zookeeper-intro/        6

This gives us the PV values of the selected pages from the log file.

The selected pages act like a site map: without such a list every visited URL would show up in the result, while the "site map" makes it much easier to find exactly the information we need.

The other metrics are extracted in the same way as PV; you can download the source code and run it to see the results!
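As one example of that, here is a minimal sketch of how the "deduplicate, then count" design for the IP metric from section 3 could be written. It follows the same structure as KPIPV.java and reuses KPI.filterPVs(); it additionally needs java.util.Set and java.util.HashSet, the class names are illustrative, and the actual KPIIP.java in the repository may differ in detail.

    public static class KPIIPMapper extends MapReduceBase implements Mapper<Object, Text, Text, Text> {
        private Text outKey = new Text();
        private Text outValue = new Text();

        @Override
        public void map(Object key, Text value, OutputCollector<Text, Text> output, Reporter reporter) throws IOException {
            KPI kpi = KPI.filterPVs(value.toString());        // reuse the page filter shown above
            if (kpi.isValid()) {
                outKey.set(kpi.getRequest());
                outValue.set(kpi.getRemote_addr());
                output.collect(outKey, outValue);             // {key: $request, value: $remote_addr}
            }
        }
    }

    public static class KPIIPReducer extends MapReduceBase implements Reducer<Text, Text, Text, Text> {
        private Text result = new Text();
        private Set<String> count = new HashSet<String>();

        @Override
        public void reduce(Text key, Iterator<Text> values, OutputCollector<Text, Text> output, Reporter reporter) throws IOException {
            count.clear();                                     // one distinct-IP set per page (per key)
            while (values.hasNext()) {
                count.add(values.next().toString());           // deduplicate the IP addresses
            }
            result.set(String.valueOf(count.size()));          // unique-IP count for this page
            output.collect(key, result);
        }
    }

The driver would look like the one in KPIPV.java, except that the map output value class is Text and no combiner is set (partial distinct counts cannot simply be summed).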

######################################################
If reading is not enough, the author's video explanations are available at: http://onbook.me/video
######################################################

Please cite the source when reposting:
http://whgmhg.com/hadoop-mapreduce-log-kpi/


Setting Up a Development Environment for Kettle Plugins

The "Java Can Do Everything" series covers Java's philosophy, application development, design patterns, program architecture and more, interpreting the power of Java through my own experience.

Where to even start with Java? Java is a language that has grown into every field. At the foundation there are four building blocks: the Java language, the JDK, the JVM, and third-party libraries. Officially, the JDK is further split by target application into JavaME, JavaSE, and JavaEE. Java can build client GUIs, middleware, phone operating systems, applications, tools, games, algorithms... Java can do almost anything.

In the world of Java, Java is everything.

About the author

  • 张丹 (Conan), programmer: Java, R, PHP, Javascript
  • weibo: @Conan_Z
  • blog: http://whgmhg.com
  • email: bsspirit@gmail.com

Please cite the source when reposting:
http://whgmhg.com/java-kettle-plugin-eclipse

kettle-plugin

Preface

Kettle is an open-source ETL tool that provides a GUI-driven solution in place of hand-written programs. Sometimes, though, we need to develop our own plugins to meet business requirements. Kettle is built on an Eclipse-based architecture with Java as the client implementation. Its powerful ETL features and graphical interface make Kettle the first choice among free ETL tools.

Table of Contents

  1. Introduction to Kettle plugin development
  2. Setting up the kettle source code environment
  3. Building the kettle project in Eclipse
  4. Building the plugin project in Eclipse
  5. Deploying the plugin into Kettle
  6. Launching the Kettle project
  7. Integrating the plugin source code into the kettle project

1. Introduction to Kettle Plugin Development

When doing ETL work, some projects involve special processing flows that Kettle's built-in steps cannot handle, and we have to build custom steps ourselves. Custom steps are mostly about data management, data validation, and extracting data from unusual file formats. By reading the kettle source code you can learn how to create your own kettle plugin.

Developing a Kettle plugin depends on having the Kettle source code environment.

2. Setting Up the Kettle Source Code Environment

1). My system environment

  • Win7: 64bit desktop
  • Java: 64bit 1.6.0_45

The kettle source lives in SVN, so we need an SVN client before we can download the code.

2). Download an SVN client: Subversion 1.8.3 (Windows 64-bit); registration is required before downloading.

http://www.collab.net/downloads/subversion

3). Install Subversion

4). Download the kettle source code

~ D:\workspace\java>svn co http://source.pentaho.org/svnkettleroot/Kettle/tags/4.4.0-stable/ kettle

A    kettle\.directory
A    kettle\.project
A    kettle\cobertura
A    kettle\cobertura\cobertura.jar
A    kettle\cobertura\lib
A    kettle\cobertura\lib\log4j-1.2.9.jar
A    kettle\cobertura\lib\LICENSE
A    kettle\cobertura\lib\javancss.jar
A    kettle\cobertura\lib\junit.jar
A    kettle\cobertura\lib\cpl-v10.html
A    kettle\cobertura\lib\jakarta-oro-2.0.8.jar
A    kettle\cobertura\lib\asm-2.1.jar
A    kettle\cobertura\lib\ccl.jar
A    kettle\src
A    kettle\src\kettle-steps.xml
A    kettle\src\kettle-job-entries.xml
A    kettle\src\kettle-import-rules.xml
A    kettle\src\org
A    kettle\src\org\pentaho
A    kettle\src\org\pentaho\xul
A    kettle\src\org\pentaho\xul\swt
A    kettle\src\org\pentaho\reporting
A    kettle\src\org\pentaho\reporting\plugin
A    kettle\src\org\pentaho\hadoop
A    kettle\src\org\pentaho\hadoop\HadoopCompression.java
A    kettle\src\org\pentaho\di
A    kettle\src\org\pentaho\di\repository
A    kettle\src\org\pentaho\di\repository\kdr
A    kettle\src\org\pentaho\di\repository\kdr\KettleDatabaseRepositorySecurityProvider.java
A    kettle\src\org\pentaho\di\repository\kdr\KettleDatabaseRepositoryCreationHelper.java
A    kettle\src\org\pentaho\di\repository\kdr\KettleDatabaseRepositoryMeta.java
A    kettle\src\org\pentaho\di\repository\kdr\KettleDatabaseRepositoryBase.java
A    kettle\src\org\pentaho\di\repository\kdr\KettleDatabaseRepository.java
A    kettle\src\org\pentaho\di\repository\kdr\delegates
A    kettle\src\org\pentaho\di\repository\kdr\delegates\KettleDatabaseRepositoryBaseDelegate.java

The download is painfully slow; unbearable.

Check where the SVN server is located:

~ ping source.pentaho.org
正在 Ping source.pentaho.org [74.205.95.173] 具有 32 字节的数据:
来自 74.205.95.173 的回复: 字节=32 时间=210ms TTL=50
来自 74.205.95.173 的回复: 字节=32 时间=209ms TTL=50
来自 74.205.95.173 的回复: 字节=32 时间=211ms TTL=50
来自 74.205.95.173 的回复: 字节=32 时间=210ms TTL=50

kettle-svn

The SVN server turns out to be in the US!! Time for a different way to get the source code.

5). Make a clone on github

    • a. Download the code via svn on a US-based VPS (finished in 30s).
    • b. Create a new git project on github.
    • c. Add a gitignore entry to exclude the .svn directories.
    • d. Push the code to my own github repository.
    • e. Clone the code from github in the local development environment.
git clone https://github.com/bsspirit/kettle-4.4.0-stable.git

6). After the download completes, run ant

~ D:\workspace\java\kettle>ant
Buildfile: D:\workspace\java\kettle\build.xml

init:
     [echo] Init...
    [mkdir] Created dir: D:\workspace\java\kettle\build
    [mkdir] Created dir: D:\workspace\java\kettle\classes
    [mkdir] Created dir: D:\workspace\java\kettle\classes\META-INF
    [mkdir] Created dir: D:\workspace\java\kettle\classes-ui
    [mkdir] Created dir: D:\workspace\java\kettle\classes-ui\ui
    [mkdir] Created dir: D:\workspace\java\kettle\classes-core
    [mkdir] Created dir: D:\workspace\java\kettle\classes-db
    [mkdir] Created dir: D:\workspace\java\kettle\classes-dbdialog
    [mkdir] Created dir: D:\workspace\java\kettle\testClasses
    [mkdir] Created dir: D:\workspace\java\kettle\lib
    [mkdir] Created dir: D:\workspace\java\kettle\distrib
    [mkdir] Created dir: D:\workspace\java\kettle\osx-distrib
    [mkdir] Created dir: D:\workspace\java\kettle\docs\api
    [mkdir] Created dir: D:\workspace\java\kettle\webstart
    [mkdir] Created dir: D:\workspace\java\kettle\junit
    [mkdir] Created dir: D:\workspace\java\kettle\pdi-ce-distrib
     [echo] Revision set to r1

compile-core:
     [echo] Compiling Kettle CORE...
    [javac] Compiling 196 source files to D:\workspace\java\kettle\classes-core

copy-core:
     [echo] Copying core images etc to classes directory...
     [copy] Copying 73 files to D:\workspace\java\kettle\classes-core

kettle-core:
     [echo] Generating the Kettle core library kettle-core.jar ...
      [jar] Building jar: D:\workspace\java\kettle\lib\kettle-core.jar

compile-db:
     [echo] Compiling Kettle DB...
    [javac] Compiling 66 source files to D:\workspace\java\kettle\classes-db

copy-db:
     [echo] Copying db images etc to classes-db directory...
     [copy] Copying 9 files to D:\workspace\java\kettle\classes-db

kettle-db:
     [echo] Generating the Kettle DB library kettle-db.jar ...
      [jar] Building jar: D:\workspace\java\kettle\lib\kettle-db.jar

compile:
     [echo] Compiling Kettle...
    [javac] Compiling 1138 source files to D:\workspace\java\kettle\classes
    [javac] D:\workspace\java\kettle\src\org\pentaho\di\job\entry\JobEntryDialogInterface.java:37: 警告:编码 GBK 的不可映射字符
    [javac]  * If the user changed any settings, the JobEntryInterface object's "changed" flag must be set to true
    [javac] ^
    [javac] D:\workspace\java\kettle\src\org\pentaho\di\job\entry\JobEntryDialogInterface.java:43: 警告:编码 GBK 的不可映射字符
    [javac]  * The JobEntryInterface object's "changed" flag must be set to the value it had at the time the dialog opened
  • [javac] ^ [javac] D:\workspace\java\kettle\src\org\pentaho\di\job\entry\JobEntryInterface.java:75: 警告:编码 GBK 的不可映射字 符 [javac] * public void loadXML(? [javac] ^ [javac] D:\workspace\java\kettle\src\org\pentaho\di\job\entry\JobEntryInterface.java:81: 警告:编码 GBK 的不可映射字 符 [javac] * public void saveRep(? [javac] ^ [javac] D:\workspace\java\kettle\src\org\pentaho\di\job\entry\JobEntryInterface.java:89: 警告:编码 GBK 的不可映射字 符 [javac] * public void loadRep(? [javac] ^ [javac] D:\workspace\java\kettle\src\org\pentaho\di\trans\steps\mondrianinput\MondrianHelper.java:121: 警告:[deprec ation] mondrian.olap.Connection 中的 execute(mondrian.olap.Query) 已过时 [javac] result = connection.execute(query); [javac] ^ [javac] 6 警告copy: [echo] Copying images etc to classes directory... [copy] Copying 1884 files to D:\workspace\java\kettle\classes [copy] Copying 1 file to D:\workspace\java\kettle\classes\META-INF kettle: [echo] Generating the Kettle library kettle-engine.jar ... [jar] Building jar: D:\workspace\java\kettle\lib\kettle-engine.jar compile-dbdialog: [echo] Compiling Kettle DB... [javac] Compiling 5 source files to D:\workspace\java\kettle\classes-dbdialog copy-dbdialog: [echo] Copying db images etc to classes-dbdialog directory... [copy] Copying 23 files to D:\workspace\java\kettle\classes-dbdialog kettle-dbdialog: [echo] Generating the Kettle DB library kettle-dbdialog.jar ... [jar] Building jar: D:\workspace\java\kettle\lib\kettle-dbdialog.jar compile-ui: [echo] Compiling Kettle UI... [javac] Compiling 585 source files to D:\workspace\java\kettle\classes-ui [javac] D:\workspace\java\kettle\src-ui\org\pentaho\di\ui\job\entries\getpop\JobEntryGetPOPDialog.java:2102: 警告: 编码 GBK 的不可映射字符 [javac] mb.setMessage("Veuillez svp donner un nom 锟?cette entr锟絜 t锟絚he!"); [javac] ^ [javac] 1 警告 [copy] Copying 200 files to D:\workspace\java\kettle\classes-ui [copy] Copying 379 files to D:\workspace\java\kettle\classes-ui\ui kettle-ui: [echo] Generating the Kettle library kettle-ui-swt.jar ... [jar] Building jar: D:\workspace\java\kettle\lib\kettle-ui-swt.jar antcontrib.download-check: antcontrib.download: [mkdir] Created dir: C:\Users\Administrator\.subfloor\tmp [get] Getting: http://downloads.sourceforge.net/ant-contrib/ant-contrib-1.0b3-bin.zip [get] To: C:\Users\Administrator\.subfloor\tmp\antcontrib.zip [get] http://downloads.sourceforge.net/ant-contrib/ant-contrib-1.0b3-bin.zip permanently moved to http://downloads .sourceforge.net/project/ant-contrib/ant-contrib/1.0b3/ant-contrib-1.0b3-bin.zip [get] http://downloads.sourceforge.net/project/ant-contrib/ant-contrib/1.0b3/ant-contrib-1.0b3-bin.zip moved to ht tp://jaist.dl.sourceforge.net/project/ant-contrib/ant-contrib/1.0b3/ant-contrib-1.0b3-bin.zip [unzip] Expanding: C:\Users\Administrator\.subfloor\tmp\antcontrib.zip into C:\Users\Administrator\.subfloor\tmp [copy] Copying 5 files to C:\Users\Administrator\.subfloor\ant-contrib install-antcontrib: compile-plugins-standalone: [echo] Compiling Kettle Plugin kettle-gpload-plugin... [mkdir] Created dir: D:\workspace\java\kettle\src-plugins\kettle-gpload-plugin\bin\classes [javac] Compiling 5 source files to D:\workspace\java\kettle\src-plugins\kettle-gpload-plugin\bin\classes [copy] Copying 7 files to D:\workspace\java\kettle\src-plugins\kettle-gpload-plugin\bin\classes [echo] Compiling Kettle Plugin kettle-palo-plugin... 
[mkdir] Created dir: D:\workspace\java\kettle\src-plugins\kettle-palo-plugin\bin\classes [javac] Compiling 17 source files to D:\workspace\java\kettle\src-plugins\kettle-palo-plugin\bin\classes [copy] Copying 28 files to D:\workspace\java\kettle\src-plugins\kettle-palo-plugin\bin\classes [echo] Compiling Kettle Plugin kettle-hl7-plugin... [mkdir] Created dir: D:\workspace\java\kettle\src-plugins\kettle-hl7-plugin\bin\classes [javac] Compiling 13 source files to D:\workspace\java\kettle\src-plugins\kettle-hl7-plugin\bin\classes [copy] Copying 14 files to D:\workspace\java\kettle\src-plugins\kettle-hl7-plugin\bin\classes [echo] Compiling Kettle Plugin market... [mkdir] Created dir: D:\workspace\java\kettle\src-plugins\market\bin\classes [javac] Compiling 9 source files to D:\workspace\java\kettle\src-plugins\market\bin\classes [javac] D:\workspace\java\kettle\src-plugins\market\src\org\pentaho\di\core\market\Market.java:533: 警告:[deprecati on] org.pentaho.di.ui.core.gui.GUIResource 中的 reload() 已过时 [javac] GUIResource.getInstance().reload(); [javac] ^ [javac] 1 警告 [copy] Copying 2 files to D:\workspace\java\kettle\src-plugins\market\bin\classes compile-plugins: kettle-plugins-jar-standalone: [echo] Generating the Kettle Plugin Jar ${plugin} ... [mkdir] Created dir: D:\workspace\java\kettle\src-plugins\kettle-gpload-plugin\dist [jar] Building jar: D:\workspace\java\kettle\src-plugins\kettle-gpload-plugin\dist\kettle-gpload-plugin.jar [echo] Generating the Kettle Plugin Jar ${plugin} ... [mkdir] Created dir: D:\workspace\java\kettle\src-plugins\kettle-palo-plugin\dist [jar] Building jar: D:\workspace\java\kettle\src-plugins\kettle-palo-plugin\dist\kettle-palo-plugin.jar [echo] Generating the Kettle Plugin Jar ${plugin} ... [mkdir] Created dir: D:\workspace\java\kettle\src-plugins\kettle-hl7-plugin\dist [jar] Building jar: D:\workspace\java\kettle\src-plugins\kettle-hl7-plugin\dist\kettle-hl7-plugin.jar [echo] Generating the Kettle Plugin Jar ${plugin} ... [mkdir] Created dir: D:\workspace\java\kettle\src-plugins\market\dist [jar] Building jar: D:\workspace\java\kettle\src-plugins\market\dist\market.jar kettle-plugins-jar: kettle-plugins-standalone: [echo] Staging the Kettle plugin kettle-gpload-plugin ... [mkdir] Created dir: D:\workspace\java\kettle\src-plugins\kettle-gpload-plugin\bin\stage\kettle-gpload-plugin [copy] Copying 1 file to D:\workspace\java\kettle\src-plugins\kettle-gpload-plugin\bin\stage\kettle-gpload-plugin [mkdir] Created dir: D:\workspace\java\kettle\src-plugins\kettle-gpload-plugin\bin\stage\kettle-gpload-plugin\lib [echo] Staging the Kettle plugin kettle-palo-plugin ... [mkdir] Created dir: D:\workspace\java\kettle\src-plugins\kettle-palo-plugin\bin\stage\kettle-palo-plugin [copy] Copying 1 file to D:\workspace\java\kettle\src-plugins\kettle-palo-plugin\bin\stage\kettle-palo-plugin [mkdir] Created dir: D:\workspace\java\kettle\src-plugins\kettle-palo-plugin\bin\stage\kettle-palo-plugin\lib [copy] Copying 1 file to D:\workspace\java\kettle\src-plugins\kettle-palo-plugin\bin\stage\kettle-palo-plugin\lib [echo] Staging the Kettle plugin kettle-hl7-plugin ... 
[mkdir] Created dir: D:\workspace\java\kettle\src-plugins\kettle-hl7-plugin\bin\stage\kettle-hl7-plugin [copy] Copying 1 file to D:\workspace\java\kettle\src-plugins\kettle-hl7-plugin\bin\stage\kettle-hl7-plugin [mkdir] Created dir: D:\workspace\java\kettle\src-plugins\kettle-hl7-plugin\bin\stage\kettle-hl7-plugin\lib [copy] Copying 10 files to D:\workspace\java\kettle\src-plugins\kettle-hl7-plugin\bin\stage\kettle-hl7-plugin\lib [echo] Staging the Kettle plugin market ... [mkdir] Created dir: D:\workspace\java\kettle\src-plugins\market\bin\stage\market [copy] Copying 1 file to D:\workspace\java\kettle\src-plugins\market\bin\stage\market [mkdir] Created dir: D:\workspace\java\kettle\src-plugins\market\bin\stage\market\lib [copy] Copying 1 file to D:\workspace\java\kettle\src-plugins\market\bin\stage\market kettle-plugins: compileTests: [echo] Compiling Kettle tests... [javac] Compiling 122 source files to D:\workspace\java\kettle\testClasses kettle-test: [echo] Generating the Kettle library kettle-test.jar ... [jar] Building jar: D:\workspace\java\kettle\lib\kettle-test.jar distrib-nodeps: [echo] Construct the distribution package... [copy] Copying 34 files to D:\workspace\java\kettle\distrib [copy] Copied 10 empty directories to 2 empty directories under D:\workspace\java\kettle\distrib [mkdir] Created dir: D:\workspace\java\kettle\distrib\lib [copy] Copying 1 file to D:\workspace\java\kettle\distrib\lib [copy] Copying 1 file to D:\workspace\java\kettle\distrib\lib [copy] Copying 1 file to D:\workspace\java\kettle\distrib\lib [copy] Copying 1 file to D:\workspace\java\kettle\distrib\lib [copy] Copying 1 file to D:\workspace\java\kettle\distrib\lib [copy] Copying 1 file to D:\workspace\java\kettle\distrib\lib [mkdir] Created dir: D:\workspace\java\kettle\distrib\libext [copy] Copying 214 files to D:\workspace\java\kettle\distrib\libext [mkdir] Created dir: D:\workspace\java\kettle\distrib\libswt [copy] Copying 21 files to D:\workspace\java\kettle\distrib\libswt [mkdir] Created dir: D:\workspace\java\kettle\distrib\plugins [copy] Copying 15 files to D:\workspace\java\kettle\distrib\plugins [copy] Copied 11 empty directories to 3 empty directories under D:\workspace\java\kettle\distrib\plugins [copy] Copying 1 file to D:\workspace\java\kettle\distrib\plugins [copy] Copied 2 empty directories to 1 empty directory under D:\workspace\java\kettle\distrib\plugins [copy] Copying 2 files to D:\workspace\java\kettle\distrib\plugins [copy] Copying 11 files to D:\workspace\java\kettle\distrib\plugins [copy] Copying 2 files to D:\workspace\java\kettle\distrib\plugins [copy] Copied 2 empty directories to 1 empty directory under D:\workspace\java\kettle\distrib\plugins [mkdir] Created dir: D:\workspace\java\kettle\distrib\ui [copy] Copying 387 files to D:\workspace\java\kettle\distrib\ui [mkdir] Created dir: D:\workspace\java\kettle\distrib\docs [copy] Copying 354 files to D:\workspace\java\kettle\distrib\docs [mkdir] Created dir: D:\workspace\java\kettle\distrib\pwd [copy] Copying 6 files to D:\workspace\java\kettle\distrib\pwd [mkdir] Created dir: D:\workspace\java\kettle\distrib\launcher [copy] Copying 3 files to D:\workspace\java\kettle\distrib\launcher [mkdir] Created dir: D:\workspace\java\kettle\distrib\simple-jndi [copy] Copying 1 file to D:\workspace\java\kettle\distrib\simple-jndi [mkdir] Created dir: D:\workspace\java\kettle\distrib\samples [mkdir] Created dir: D:\workspace\java\kettle\distrib\samples\transformations [mkdir] Created dir: D:\workspace\java\kettle\distrib\samples\jobs [mkdir] 
Created dir: D:\workspace\java\kettle\distrib\samples\transformations\output [mkdir] Created dir: D:\workspace\java\kettle\distrib\samples\jobs\output [copy] Copying 248 files to D:\workspace\java\kettle\distrib\samples distrib: default: BUILD SUCCESSFUL Total time: 1 minute 29 seconds
Despite a few warnings, the build succeeds!!

3. Building the kettle Project in Eclipse

7). Import the kettle project into Eclipse.

kettle-eclipse

4. Building the Plugin Project in Eclipse

8). Build the plugin project: we can start from a template plugin.

Download the kettle-TemplatePlugin project:

wget http://www.ahuoo.com/download/TemplateStepPlugin.rar

9). Unpack it and import it into Eclipse as the project kettle-TemplatePlugin

Copy the libraries:

  • From the kettle project, copy the *.jar files in lib into kettle-TemplatePlugin's libext directory
  • From the kettle project, copy libswt/win64/swt.jar into kettle-TemplatePlugin's libswt/win64 directory

10). Add the libraries just copied to the project dependencies

kettle-eclipse-template

11). Run ant in the kettle-TemplatePlugin project

    
    ~ D:\workspace\java\kettle-TemplatePlugin>ant
    Buildfile: D:\workspace\java\kettle-TemplatePlugin\build.xml
    
    init:
         [echo] Init...
    
    compile:
         [echo] Compiling Jasper Reporting Plugin...
        [javac] D:\workspace\java\kettle-TemplatePlugin\build.xml:40: warning: 'includeantruntime' was not set, defaulting t
    o build.sysclasspath=last; set to false for repeatable builds
    
    copy:
         [echo] Copying images etc to classes directory...
    
    lib:
         [echo] Generating the Jasper Reporting library TemplateStepPlugin.jar ...
          [jar] Building jar: D:\workspace\java\kettle-TemplatePlugin\lib\TemplateStepPlugin.jar
    
    distrib:
         [echo] Copying libraries to distrib directory...
         [copy] Copying 1 file to D:\workspace\java\kettle-TemplatePlugin\distrib
    
    deploy:
         [echo] deploying plugin...
    
    default:
    
    BUILD SUCCESSFUL
    Total time: 0 seconds
    

12). The files in the distrib directory

  • icon.png: the icon file
  • plugin.xml: the plugin descriptor (can be omitted from version 4.4 on)
  • TemplateStepPlugin.jar: the jar generated by ant

5. Deploying the Plugin into Kettle

13). Publish kettle-TemplatePlugin into kettle

a. Add two directories to the kettle project

    
    ~ mkdir D:\workspace\java\kettle\distrib\plugins\steps\myPlugin
    ~ mkdir D:\workspace\java\kettle\plugins\steps\myPlugin
    

b. Modify kettle-TemplatePlugin's build.xml

    
    <property name="deploydir" location="D:\workspace\java\kettle\distrib\plugins\steps\myPlugin"/>
    <property name="projectdir" location="D:\workspace\java\kettle\plugins\steps\myPlugin"/>
    
    <fileset dir="${libswt}/win64/" includes="*.jar"/>
    
    <target name="deploy" depends="distrib" description="Deploy distribution..." >
    <echo>deploying plugin...</echo>
    <copy todir="${deploydir}">
    <fileset dir="${distrib}" includes="**/*.*"/>
    </copy>
    
    <copy todir="${projectdir}">
    <fileset dir="${distrib}" includes="**/*.*"/>
    </copy>
    </target>
    

c. Run ant in the kettle-TemplatePlugin project

    
    D:\workspace\java\kettle-TemplatePlugin>ant
    Buildfile: D:\workspace\java\kettle-TemplatePlugin\build.xml
    
    init:
         [echo] Init...
    
    compile:
         [echo] Compiling Jasper Reporting Plugin...
        [javac] D:\workspace\java\kettle-TemplatePlugin\build.xml:43: warning: 'includeantruntime' was not set, defaulting t
    o build.sysclasspath=last; set to false for repeatable builds
    
    copy:
         [echo] Copying images etc to classes directory...
    
    lib:
         [echo] Generating the Jasper Reporting library TemplateStepPlugin.jar ...
    
    distrib:
         [echo] Copying libraries to distrib directory...
    
    deploy:
         [echo] deploying plugin...
         [copy] Copying 3 files to D:\workspace\java\kettle\distrib\plugins\steps\myPlugin
         [copy] Copying 6 files to D:\workspace\java\kettle\distrib\plugins\steps\myPlugin
         [copy] Copying 3 files to D:\workspace\java\kettle\plugins\steps\myPlugin
    
    default:
    
    BUILD SUCCESSFUL
    Total time: 0 seconds
    

14). Check the directory in kettle: D:\workspace\java\kettle\distrib\plugins\steps\myPlugin

kettle-dist

The 3 files from the kettle-TemplatePlugin project are now in the right place.

6. Launching Kettle from the Command Line

15). Start kettle from the command line
a. Modify the Spoon startup command so it does not open a new window and runs Java directly

    
    ~ vi D:\workspace\java\kettle\distrib\Spoon.bat
    
    @echo on
    REM start "Spoon" "%_PENTAHO_JAVA%" %OPT% -jar launcher\launcher.jar -lib ..\%LIBSPATH% %_cmdline%
    java %OPT% -jar launcher\launcher.jar -lib ..\%LIBSPATH% %_cmdline%
    

b. Run the Spoon.bat command

    
    ~ D:\workspace\java\kettle\distrib>Spoon.bat
    
    DEBUG: Using JAVA_HOME
    DEBUG: _PENTAHO_JAVA_HOME=D:\toolkit\java\jdk6
    DEBUG: _PENTAHO_JAVA=D:\toolkit\java\jdk6\bin\javaw
    
    D:\workspace\java\kettle\distrib>REM start "Spoon" "D:\toolkit\java\jdk6\bin\javaw" "-Xmx512m" "-XX:MaxPermSize=256m" "-Djava.library.path=libswt\win64" "-DKETTLE_HOME=" "-DKETTLE_REPOSITORY=" "-DKETTLE_USER=" "-DKETTLE_PASSWORD=" "-DKETTLE_PLUGIN_PACKAGES=" "-DKETTLE_LOG_SIZE_LIMIT=" -jar launcher\launcher.jar -lib ..\libswt\win64

    D:\workspace\java\kettle\distrib>java "-Xmx512m" "-XX:MaxPermSize=256m" "-Djava.library.path=libswt\win64" "-DKETTLE_HOME=" "-DKETTLE_REPOSITORY=" "-DKETTLE_USER=" "-DKETTLE_PASSWORD=" "-DKETTLE_PLUGIN_PACKAGES=" "-DKETTLE_LOG_SIZE_LIMIT=" -jar launcher\launcher.jar -lib ..\libswt\win64
    INFO  21-09 12:26:35,717 - Spoon - Logging goes to file:///C:/Users/ADMINI~1/AppData/Local/Temp/spoon_0042f442-2276-11e3
    -bf49-6be1282e1ee0.log
    INFO  21-09 12:26:36,655 - Spoon - 要求资源库
    INFO  21-09 12:26:36,795 - RepositoriesMeta - Reading repositories XML file: C:\Users\Administrator\.kettle\repositories
    .xml
    INFO  21-09 12:26:37,783 - Version checker - OK
    

16). Check the kettle-TemplatePlugin plugin
kettle-debug1

7. Integrating the Plugin Source Code into the kettle Project

17). Link the kettle-TemplatePlugin project with Eclipse's link source feature
a. In the kettle project, choose link source
kettle-link-source

b. Code inside the kettle project
kettle-source

18). Launch kettle from Eclipse
a. In Eclipse, configure the launch Main Class: org.pentaho.di.ui.spoon.Spoon
kettle-run1

b. Add the 64-bit swt.jar library
kettle-run2

c. Start kettle from Eclipse
kettle-run3

19). Invoke kettle-TemplatePlugin from Eclipse
a. Modify TemplateStepDialog.java: find the open method and add one line of output

    
    public String open() { 
    System.out.println("Open a dialog!!!");
    
    ...
    }
    

b. In Eclipse, launch org.pentaho.di.ui.spoon.Spoon in debug mode
Double-click the Template Plugin icon; the log shows "Open a dialog!!!"

kettle-debug1

With this, the development environment for kettle plugins is in place. Next we can start developing the plugin itself!!

Please cite the source when reposting:
http://whgmhg.com/java-kettle-plugin-eclipse


Nova Installation Guide

The build-your-own-VPS series explains how to use your own computing resources and virtualization technology to run your own VPS.

In the Web 2.0 era everyone has a blog and plenty of other personal internet applications, most of them provided by internet companies. Capable developers (geeks) often want to build their own applications with the newest, coolest technology, with their own domain and their own server. That usually means renting a VPS on the internet: a remote machine where you can deploy your applications, point a domain at it, and publish them. This site, "@晒粉丝", runs on a Linode VPS in the Dallas data center.

But a VPS can also be built by ourselves. With a high-performance server, an IP address, and a router, one powerful server can quickly be turned into 5, 10, or 20 virtual VPS instances. We can publish our own applications on them and rent the spare capacity to other internet users. This series covers: "choosing a virtualization technology", "dynamic IP resolution", "installing KVM on Ubuntu and building a virtual environment", "adding disks to a KVM virtual machine", "Nova installation guide", "network architecture of the VPS internal network", and "renting out VPS cloud services".

About the author:

  • 张丹 (Conan), programmer: Java, R, PHP, Javascript
  • weibo: @Conan_Z
  • blog: http://whgmhg.com
  • email: bsspirit@gmail.com

Please cite the source when reposting:
http://whgmhg.com/vps-nova-setup/

openstack

Preface

Nova is a key component of OpenStack; its core job is managing virtual machines (kvm, qemu, xen, vmware, virtual box).

This installation exercise has a strict requirement on the Linux Ubuntu version: it must be 12.04 LTS. The experiment did not succeed on any other version, so please stick to it.

Reference book for this exercise:
OpenStack Cloud Computing Cookbook
Chapter 1: Starting OpenStack Compute

Table of Contents

  1. nova installation plan
  2. VirtualBox virtual machine environment
  3. Operating system environment
  4. Package dependencies
  5. nova configuration
  6. Creating a nova instance
  7. Logging in to the cloud instance
  8. Error summary

1. nova Installation Plan

Use a VirtualBox virtual machine with nested qemu virtual machines: nova is installed inside the VirtualBox environment, the cloud instances run in qemu, and nova manages the qemu instances.

2. VirtualBox Virtual Machine Environment

VirtualBox VM: 6 GB RAM, 4 CPU cores, Linux Ubuntu 12.04 LTS
CPU with VT-x/AMD-V, nested paging, PAE/NX

nova1

3 virtual network adapters:

  • Adapter 1: bridged
  • Adapter 2: Host-Only
  • Adapter 3: Host-Only

vbox2

Global settings of the Host-Only adapter in the VM:
vbox1

3. Operating System Environment

Once again: this exercise requires Ubuntu 12.04 LTS.

    
    ~ uname -a
    Linux nova 3.2.0-23-generic #36-Ubuntu SMP Tue Apr 10 20:39:51 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux
    
    ~ cat /etc/issue
    Ubuntu 12.04 LTS \n \l
    
    ~ ifconfig
    eth0      Link encap:Ethernet  HWaddr 08:00:27:90:e8:19
              inet addr:192.168.1.200  Bcast:192.168.1.255  Mask:255.255.255.0
              inet6 addr: fe80::a00:27ff:fe90:e819/64 Scope:Link
              UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
              RX packets:162 errors:0 dropped:0 overruns:0 frame:0
              TX packets:132 errors:0 dropped:0 overruns:0 carrier:0
              collisions:0 txqueuelen:1000
              RX bytes:16399 (16.3 KB)  TX bytes:22792 (22.7 KB)
    
    lo        Link encap:Local Loopback
              inet addr:127.0.0.1  Mask:255.0.0.0
              inet6 addr: ::1/128 Scope:Host
              UP LOOPBACK RUNNING  MTU:16436  Metric:1
              RX packets:0 errors:0 dropped:0 overruns:0 frame:0
              TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
              collisions:0 txqueuelen:0
              RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)
    

Create the openstack user
Create a new user openstack with password openstack and add it to the sudo group

    
    ~ sudo useradd openstack
    ~ sudo passwd openstack  
    ~ sudo adduser openstack sudo
    
    ~ sudo mkdir -p /home/openstack
    ~ sudo chown openstack:openstack /home/openstack
    
    ~ ls -l /home
    drwxr-xr-x 8 conan     conan     4096 Jul 13 17:07 conan
    drwxr-xr-x 2 openstack openstack 4096 Jul 13 17:21 openstack
    

Log in again as the openstack user and test the sudo command.
All following operations should be performed as the openstack user.

    
    ssh openstack@192.168.1.200
    
    openstack@u1:~$ whoami
    openstack
    
    openstack@u1:~$ sudo -i  
    [sudo] password for openstack:
    
    root@u1:~# whoami
    root
    
    root@u1:~# exit
    logout
    

Configure the VM network interfaces

    
    ~ sudo vi /etc/network/interfaces
    
    auto lo
    iface lo inet loopback
    
    auto eth0
    iface eth0 inet static
    address 192.168.1.200
    netmask 255.255.255.0
    network 192.168.1.0
    broadcast 192.168.1.255
    gateway 192.168.1.1
    
    #public interface
    auto eth1
    iface eth1 inet static
    address 172.16.0.1
    netmask 255.255.0.0
    network 172.16.0.0
    broadcast 172.16.255.255
    
    #private interface
    auto eth2
    iface eth2 inet manual
    up ifconfig eth2 up
    

Restart the network interfaces

    
    ~ sudo /etc/init.d/networking restart
    
     * Running /etc/init.d/networking restart is deprecated because it may not enable again some interfaces
     * Reconfiguring network interfaces...                                                 ssh stop/waiting
    ssh start/running, process 2040
    ssh stop/waiting
    ssh start/running, process 2082
    ssh stop/waiting
    ssh start/running, process 2121
                                                                                [ OK ]
    ~ ifconfig
    eth0      Link encap:Ethernet  HWaddr 08:00:27:90:e8:19
              inet addr:192.168.1.200  Bcast:192.168.1.255  Mask:255.255.255.0
              inet6 addr: fe80::a00:27ff:fe90:e819/64 Scope:Link
              UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
              RX packets:3408 errors:0 dropped:0 overruns:0 frame:0
              TX packets:2244 errors:0 dropped:0 overruns:0 carrier:0
              collisions:0 txqueuelen:1000
              RX bytes:3321759 (3.3 MB)  TX bytes:250703 (250.7 KB)
    
    eth1      Link encap:Ethernet  HWaddr 08:00:27:4e:06:74
              inet addr:172.16.0.1  Bcast:172.16.255.255  Mask:255.255.0.0
              inet6 addr: fe80::a00:27ff:fe4e:674/64 Scope:Link
              UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
              RX packets:18 errors:0 dropped:0 overruns:0 frame:0
              TX packets:6 errors:0 dropped:0 overruns:0 carrier:0
              collisions:0 txqueuelen:1000
              RX bytes:1656 (1.6 KB)  TX bytes:468 (468.0 B)
    
    eth2      Link encap:Ethernet  HWaddr 08:00:27:5a:b1:1f
              inet6 addr: fe80::a00:27ff:fe5a:b11f/64 Scope:Link
              UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
              RX packets:33 errors:0 dropped:0 overruns:0 frame:0
              TX packets:5 errors:0 dropped:0 overruns:0 carrier:0
              collisions:0 txqueuelen:1000
              RX bytes:3156 (3.1 KB)  TX bytes:378 (378.0 B)
    
    lo        Link encap:Local Loopback
              inet addr:127.0.0.1  Mask:255.0.0.0
              inet6 addr: ::1/128 Scope:Host
              UP LOOPBACK RUNNING  MTU:16436  Metric:1
              RX packets:0 errors:0 dropped:0 overruns:0 frame:0
              TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
              collisions:0 txqueuelen:0
              RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)
    

DNS configuration

    
    ~ vi /etc/resolv.conf 
    nameserver 8.8.8.8
    
    ~ ping www.163.com
    PING 163.xdwscache.glb0.lxdns.com (101.23.128.17) 56(84) bytes of data.
    64 bytes from 101.23.128.17: icmp_req=1 ttl=53 time=20.3 ms
    64 bytes from 101.23.128.17: icmp_req=2 ttl=53 time=18.5 ms
    

4. Package Dependencies

Update the apt sources

    
    ~ sudo vi /etc/apt/sources.list
    
    deb http://mirrors.163.com/ubuntu/ precise main universe restricted multiverse 
    deb-src http://mirrors.163.com/ubuntu/ precise main universe restricted multiverse 
    deb http://mirrors.163.com/ubuntu/ precise-security universe main multiverse restricted 
    deb-src http://mirrors.163.com/ubuntu/ precise-security universe main multiverse restricted 
    deb http://mirrors.163.com/ubuntu/ precise-updates universe main multiverse restricted 
    deb http://mirrors.163.com/ubuntu/ precise-proposed universe main multiverse restricted 
    deb-src http://mirrors.163.com/ubuntu/ precise-proposed universe main multiverse restricted 
    deb http://mirrors.163.com/ubuntu/ precise-backports universe main multiverse restricted 
    deb-src http://mirrors.163.com/ubuntu/ precise-backports universe main multiverse restricted 
    deb-src http://mirrors.163.com/ubuntu/ precise-updates universe main multiverse restricted
    

Install nova-related packages

    
    ~ sudo apt-get update
    ~ sudo apt-get -y install rabbitmq-server nova-api nova-objectstore nova-scheduler nova-network nova-compute nova-cert glance qemu unzip
    ~ sudo apt-get install pm-utils
    

Note: if pm-utils is not installed, the libvirtd log will contain the error:

    
    Cannot find 'pm-is-supported' in path: No such file or directory
    

Check the system processes

    
    ~ pstree
    init─┬─acpid
         ├─atd
         ├─beam.smp─┬─cpu_sup
         │          ├─inet_gethost───inet_gethost
         │          └─38*[{beam.smp}]
         ├─cron
         ├─dbus-daemon
         ├─dhclient3
         ├─dnsmasq
         ├─epmd
         ├─5*[getty]
         ├─irqbalance
         ├─2*[iscsid]
         ├─libvirtd───10*[{libvirtd}]
         ├─login───bash
         ├─rsyslogd───3*[{rsyslogd}]
         ├─sshd───sshd───bash
         ├─sshd─┬─sshd───sshd───sh───bash───pstree
         │      └─sshd───sshd───sh───bash
         ├─su───glance-api
         ├─su───glance-registry
         ├─su───nova-api
         ├─su───nova-cert
         ├─su───nova-network
         ├─su───nova-objectstor
         ├─su───nova-scheduler
         ├─su───nova-compute
         ├─udevd───2*[udevd]
         ├─upstart-socket-
         ├─upstart-udev-br
         └─whoopsie───{whoopsie}
    

Install the ntp time synchronization service

    
    ~ sudo apt-get -y install ntp
    
    #edit the config file
    ~ sudo vi /etc/ntp.conf
    
    #Replace ntp.ubuntu.com with an NTP server on your network
    server ntp.ubuntu.com
    server 127.127.1.0
    fudge 127.127.1.0 stratum 10
    
    #restart ntp
    ~ sudo service ntp restart
    
    ~ ps -aux|grep ntp
    ntp       6990  0.0  0.0  37696  2180 ?        Ss   19:50   0:00 /usr/sbin/ntpd -p /var/run/ntpd.pid -g -u 113:120
    
    #check the current system time
    ~ date
    Sat Jul 13 19:50:12 CST 2013
    

Install MySQL

    
    ~ sudo apt-get install mysql-server
    
    #allow access from other hosts
    ~ sudo sed -i 's/127.0.0.1/0.0.0.0/g' /etc/mysql/my.cnf
    ~ sudo service mysql restart
    
    #create the nova database and configure the nova user
    ~ MYSQL_PASS=mysql
    ~ mysql -uroot -p$MYSQL_PASS -e 'CREATE DATABASE nova;'
    ~ mysql -uroot -p$MYSQL_PASS -e "GRANT ALL PRIVILEGES ON nova.* TO 'nova'@'%'"
    ~ mysql -uroot -p$MYSQL_PASS -e "SET PASSWORD FOR 'nova'@'%' = PASSWORD('openstack');"
    

5. nova Configuration

Edit the nova.conf configuration file.
The --verbose flag can be removed; it is only there to print more detailed log output.

    
    ~ sudo vi /etc/nova/nova.conf
    
    --dhcpbridge_flagfile=/etc/nova/nova.conf
    --dhcpbridge=/usr/bin/nova-dhcpbridge
    --logdir=/var/log/nova
    --state_path=/var/lib/nova
    --lock_path=/var/lock/nova
    --force_dhcp_release
    --iscsi_helper=tgtadm
    --libvirt_use_virtio_for_bridges
    --connection_type=libvirt
    --root_helper=sudo nova-rootwrap
    --verbose
    --ec2_private_dns_show_ip
    --sql_connection=mysql://nova:openstack@172.16.0.1/nova
    --use_deprecated_auth
    --s3_host=172.16.0.1
    --rabbit_host=172.16.0.1
    --ec2_host=172.16.0.1
    --ec2_dmz_host=172.16.0.1
    --public_interface=eth1
    --image_service=nova.image.glance.GlanceImageService
    --glance_api_servers=172.16.0.1:9292
    --auto_assign_floating_ip=true
    --scheduler_default_filters=AllHostsFilter
    

Set the VMM type in nova-compute.conf.
We use qemu here because of the nested virtualization setup; if you are not nesting, kvm is recommended instead.

    
    ~ sudo vi /etc/nova/nova-compute.conf
    
    --libvirt_type=qemu
    

Write the nova metadata (schema) into MySQL

    
    ~ sudo nova-manage db sync
    
    2013-07-14 21:18:56 DEBUG nova.utils [-] backend  from (pid=8750) __get_backend /usr/lib/python2.7/dist-packages/nova/utils.py:663
    2013-07-14 21:19:06 WARNING nova.utils [-] /usr/lib/python2.7/dist-packages/sqlalchemy/pool.py:639: SADeprecationWarning: The 'listeners' argument to Pool (and create_engine()) is deprecated.  Use event.listen().
      Pool.__init__(self, creator, **kw)
    
    2013-07-14 21:19:06 WARNING nova.utils [-] /usr/lib/python2.7/dist-packages/sqlalchemy/pool.py:145: SADeprecationWarning: Pool.add_listener is deprecated.  Use event.listen()
      self.add_listener(l)
    
    2013-07-14 21:19:06 AUDIT nova.db.sqlalchemy.fix_dns_domains [-] Applying database fix for Essex dns_domains table.
    

Create the openstack private network

    
    ~ sudo nova-manage network create vmnet --fixed_range_v4=10.0.0.0/8 --network_size=64 --bridge_interface=eth2
    
    2013-07-14 21:19:34 DEBUG nova.utils [req-152fee41-ddc9-4ac5-902d-fb93f7af67a8 None None] backend  from (pid=8807) __get_backend /usr/lib/python2.7/dist-packages/nova/utils.py:663
    

Set up the openstack floating IP range

    
    ~ sudo nova-manage floating create --ip_range=172.16.1.0/24
    
    2013-07-14 21:19:48 DEBUG nova.utils [req-7171e8bc-6542-40d2-b24c-b4593505fd87 None None] backend  from (pid=8814) __get_backend /usr/lib/python2.7/dist-packages/nova/utils.py:663
    

Inspect the nova database in MySQL

    
    ~ mysql -uroot -p
    
    mysql> show databases;
    +--------------------+
    | Database           |
    +--------------------+
    | information_schema |
    | ape_biz            |
    | mysql              |
    | nova               |
    | performance_schema |
    | test               |
    +--------------------+
    6 rows in set (0.00 sec)
    
    mysql> use nova
    
    mysql> show tables;
    +-------------------------------------+
    | Tables_in_nova                      |
    +-------------------------------------+
    | agent_builds                        |
    | aggregate_hosts                     |
    | aggregate_metadata                  |
    | aggregates                          |
    | auth_tokens                         |
    | block_device_mapping                |
    | bw_usage_cache                      |
    | cells                               |
    | certificates                        |
    | compute_nodes                       |
    | console_pools                       |
    | consoles                            |
    | dns_domains                         |
    | fixed_ips                           |
    | floating_ips                        |
    | instance_actions                    |
    | instance_faults                     |
    | instance_info_caches                |
    | instance_metadata                   |
    | instance_type_extra_specs           |
    | instance_types                      |
    | instances                           |
    | iscsi_targets                       |
    | key_pairs                           |
    | migrate_version                     |
    | migrations                          |
    | networks                            |
    | projects                            |
    | provider_fw_rules                   |
    | quotas                              |
    | s3_images                           |
    | security_group_instance_association |
    | security_group_rules                |
    | security_groups                     |
    | services                            |
    | sm_backend_config                   |
    | sm_flavors                          |
    | sm_volume                           |
    | snapshots                           |
    | user_project_association            |
    | user_project_role_association       |
    | user_role_association               |
    | users                               |
    | virtual_interfaces                  |
    | virtual_storage_arrays              |
    | volume_metadata                     |
    | volume_type_extra_specs             |
    | volume_types                        |
    | volumes                             |
    +-------------------------------------+
    49 rows in set (0.00 sec)
    

Restart the nova, libvirt and glance services (a loop form of the same restarts is sketched after the process tree below)

    
#stop the services
    ~ sudo stop nova-compute
    ~ sudo stop nova-network
    ~ sudo stop nova-api
    ~ sudo stop nova-scheduler
    ~ sudo stop nova-objectstore
    ~ sudo stop nova-cert
    
    ~ sudo stop libvirt-bin
    ~ sudo stop glance-registry
    ~ sudo stop glance-api
    
#start the services
    ~ sudo start nova-compute
    ~ sudo start nova-network
    ~ sudo start nova-api
    ~ sudo start nova-scheduler
    ~ sudo start nova-objectstore
    ~ sudo start nova-cert
    
    ~ sudo start libvirt-bin
    ~ sudo start glance-registry
    ~ sudo start glance-api
    
#show the system process tree
    ~ pstree
    init─┬─acpid
         ├─atd
         ├─beam.smp─┬─cpu_sup
         │          ├─inet_gethost───inet_gethost
         │          └─38*[{beam.smp}]
         ├─cron
         ├─dbus-daemon
         ├─dhclient3
         ├─dnsmasq
         ├─epmd
         ├─5*[getty]
         ├─irqbalance
         ├─2*[iscsid]
         ├─libvirtd───10*[{libvirtd}]
         ├─login───bash
         ├─mysqld───19*[{mysqld}]
         ├─ntpd
         ├─rsyslogd───3*[{rsyslogd}]
         ├─sshd───sshd───bash
         ├─sshd─┬─sshd───sshd───sh───bash───pstree
         │      └─sshd───sshd───sh───bash
         ├─su───glance-registry
         ├─su───glance-api
         ├─su───nova-network
         ├─su───nova-api
         ├─su───nova-scheduler
         ├─su───nova-objectstor
         ├─su───nova-cert
         ├─su───nova-compute
         ├─udevd───2*[udevd]
         ├─upstart-socket-
         ├─upstart-udev-br
         └─whoopsie───{whoopsie}
    
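Since all of the services above are Upstart jobs, the stop/start pairs can also be written as a small shell loop. A sketch using the same job names:

~ for svc in nova-compute nova-network nova-api nova-scheduler nova-objectstore nova-cert libvirt-bin glance-registry glance-api; do
      sudo stop  "$svc" || true    # ignore the error if the job was not running
      sudo start "$svc"
  done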

Create the nova user, role, and project (the results can be listed with the nova-manage commands sketched after this block)

    
#create the user
    ~ sudo nova-manage user admin openstack
    2013-07-14 21:22:00 DEBUG nova.utils [req-6a95dd03-04db-4f60-9198-d77a4d4936e8 None None] backend  from (pid=9254) __get_backend /usr/lib/python2.7/dist-packages/nova/utils.py:663
    2013-07-14 21:22:00 AUDIT nova.auth.manager [-] Created user openstack (admin: True)
    export EC2_ACCESS_KEY=62ff82fa-74a9-4ffb-a420-ea190e893863
    export EC2_SECRET_KEY=f1f32aed-85fe-406d-8f28-bbf02d7a7134
    
#create the role
    ~ sudo nova-manage role add openstack cloudadmin
    2013-07-14 21:22:15 AUDIT nova.auth.manager [-] Adding sitewide role cloudadmin to user openstack
    2013-07-14 21:22:15 DEBUG nova.utils [req-a9d8cdfa-263c-4d6a-8c69-d6571aabee00 None None] backend  from (pid=9262) __get_backend /usr/lib/python2.7/dist-packages/nova/utils.py:663
    
#create the project
    ~ sudo nova-manage project create cookbook openstack
    2013-07-14 21:22:34 DEBUG nova.utils [req-3a340500-6674-439e-ac95-e28954637cf5 None None] backend  from (pid=9395) __get_backend /usr/lib/python2.7/dist-packages/nova/utils.py:663
    2013-07-14 21:22:34 AUDIT nova.auth.manager [-] Created project cookbook with manager openstack
    
    ~ sudo nova-manage project zipfile cookbook openstack
    2013-07-14 21:22:49 DEBUG nova.utils [req-429e7839-6009-4862-98c5-af01ceac9cee None None] backend  from (pid=9402) __get_backend /usr/lib/python2.7/dist-packages/nova/utils.py:663
    2013-07-14 21:22:49 DEBUG nova.utils [-] Running cmd (subprocess): openssl genrsa -out /tmp/tmprZNpYT/temp.key 1024 from (pid=9402) execute /usr/lib/python2.7/dist-packages/nova/utils.py:224
    2013-07-14 21:22:49 DEBUG nova.utils [-] Running cmd (subprocess): openssl req -new -key /tmp/tmprZNpYT/temp.key -out /tmp/tmprZNpYT/temp.csr -batch -subj /C=US/ST=California/O=OpenStack/OU=NovaDev/CN=cookbook-openstack-2013-07-14T13:22:49Z from (pid=9402) execute /usr/lib/python2.7/dist-packages/nova/utils.py:224
    2013-07-14 21:22:49 DEBUG nova.crypto [-] Flags path: /var/lib/nova/CA from (pid=9402) _sign_csr /usr/lib/python2.7/dist-packages/nova/crypto.py:290
    2013-07-14 21:22:49 DEBUG nova.utils [-] Running cmd (subprocess): openssl ca -batch -out /tmp/tmpJZvmrM/outbound.csr -config ./openssl.cnf -infiles /tmp/tmpJZvmrM/inbound.csr from (pid=9402) execute /usr/lib/python2.7/dist-packages/nova/utils.py:224
    2013-07-14 21:22:49 DEBUG nova.utils [-] Running cmd (subprocess): openssl x509 -in /tmp/tmpJZvmrM/outbound.csr -serial -noout from (pid=9402) execute /usr/lib/python2.7/dist-packages/nova/utils.py:224
    2013-07-14 21:22:49 WARNING nova.auth.manager [-] No vpn data for project cookbook
    
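With the deprecated auth backend used here, nova-manage can also list what was just created, which is handy for confirming the user and project really exist. A quick verification sketch:

~ sudo nova-manage user list
~ sudo nova-manage project list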

Install and configure the command-line tools

    
    ~ sudo apt-get install euca2ools python-novaclient unzip
    
    ~ pwd
    /home/openstack
    
    ~ ls -l
    -rw-r--r-- 1 root root 5930 Jul 13 20:38 nova.zip
    
#unzip the credentials bundle (nova.zip)
    ~ unzip nova.zip
    Archive:  nova.zip
    extracting: novarc
    extracting: pk.pem
    extracting: cert.pem
    extracting: cacert.pem
    
#load the environment variables
    ~ . novarc
    
#inspect the environment variables
    ~ env
    LC_PAPER=en_US
    LC_ADDRESS=en_US
    LC_MONETARY=en_US
    SHELL=/bin/sh
    TERM=xterm
    SSH_CLIENT=192.168.1.11 60377 22
    LC_NUMERIC=en_US
    EUCALYPTUS_CERT=/home/openstack/cacert.pem
    OLDPWD=/var/log/libvirt
    SSH_TTY=/dev/pts/0
    LC_ALL=en_US.UTF-8
    USER=openstack
    LC_TELEPHONE=en_US
    NOVA_CERT=/home/openstack/cacert.pem
    EC2_SECRET_KEY=6f964a16-6036-44ef-bdf3-23dff94f5b94
    NOVA_PROJECT_ID=cookbook
    EC2_USER_ID=42
    PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/home/conan/toolkit/jdk16/bin:/home/conan/toolkit/cassandra/bin
    MAIL=/var/mail/openstack
    NOVA_VERSION=1.1
    LC_IDENTIFICATION=en_US
    NOVA_USERNAME=openstack
    PWD=/home/openstack
    JAVA_HOME=/home/conan/toolkit/jdk16
    LANG=en_US.UTF-8
    CASSANDRA_HOME=/home/conan/toolkit/cassandra125
    LC_MEASUREMENT=en_US
    NOVA_API_KEY=openstack
    NOVA_URL=http://172.16.0.1:8774/v1.1/
    SHLVL=1
    HOME=/home/openstack
    LANGUAGE=en_US:en
    EC2_URL=http://172.16.0.1:8773/services/Cloud
    LOGNAME=openstack
    SSH_CONNECTION=192.168.1.11 60377 192.168.1.200 22
    EC2_ACCESS_KEY=openstack:cookbook
    EC2_PRIVATE_KEY=/home/openstack/pk.pem
    DISPLAY=localhost:10.0
    S3_URL=http://172.16.0.1:3333
    LC_TIME=en_US
    EC2_CERT=/home/openstack/cert.pem
    LC_NAME=en_US
    _=/usr/bin/env
    
#create a keypair
    ~ euca-add-keypair openstack > openstack.pem
    ~ chmod 0600 *.pem
    
    ~ ls -l
    -rw------- 1 openstack openstack 1029 Jul 13 20:38 cacert.pem
    -rw------- 1 openstack openstack 2515 Jul 13 20:38 cert.pem
    -rw------- 1 openstack openstack 1113 Jul 13 20:38 novarc
    -rw-r--r-- 1 root      root      5930 Jul 13 20:38 nova.zip
    -rw------- 1 openstack openstack  954 Jul 13 20:50 openstack.pem
    -rw------- 1 openstack openstack  887 Jul 13 20:38 pk.pem
    

Check the nova services

    
    ~ euca-describe-availability-zones verbose
    AVAILABILITYZONE        nova    available
    AVAILABILITYZONE        |- nova
    AVAILABILITYZONE        | |- nova-scheduler     enabled :-) 2013-07-14 13:23:53
    AVAILABILITYZONE        | |- nova-compute       enabled :-) 2013-07-14 13:23:53
    AVAILABILITYZONE        | |- nova-cert  enabled :-) 2013-07-14 13:23:53
    AVAILABILITYZONE        | |- nova-network       enabled :-) 2013-07-14 13:23:53
    

6. Create a nova instance

Upload the cloud image to the virtual host.
Download the ubuntu-12.04-server-cloudimg-i386.tar.gz file yourself from: http://uec-images.ubuntu.com/releases/precise/release/ubuntu-12.04-server-cloudimg-i386.tar.gz

    
    ~ scp ubuntu-12.04-server-cloudimg-i386.tar.gz openstack@192.168.1.200:/home/openstack
    ubuntu-12.04-server-cloudimg-i386.tar.gz                                              100%  206MB -11880.-5KB/s   00:07
    
    ~ ls -l
    -rw------- 1 openstack openstack      1029 Jul 14 21:22 cacert.pem
    -rw------- 1 openstack openstack      2515 Jul 14 21:22 cert.pem
    -rw------- 1 openstack openstack      1113 Jul 14 21:22 novarc
    -rw-r--r-- 1 root      root           5930 Jul 14 21:22 nova.zip
    -rw------- 1 openstack openstack       954 Jul 14 21:23 openstack.pem
    -rw------- 1 openstack openstack       887 Jul 14 21:22 pk.pem
    -rw-r--r-- 1 openstack openstack 215487341 Jul 14 21:25 ubuntu-12.04-server-cloudimg-i386.tar.gz
    

Publish (register) the cloud image

    
    ~ cloud-publish-tarball ubuntu-12.04-server-cloudimg-i386.tar.gz images i386
    
    Sun Jul 14 21:25:34 CST 2013: ====== extracting image ======
    Warning: no ramdisk found, assuming '--ramdisk none'
    kernel : precise-server-cloudimg-i386-vmlinuz-virtual
    ramdisk: none
    image  : precise-server-cloudimg-i386.img
    Sun Jul 14 21:25:40 CST 2013: ====== bundle/upload kernel ======
    Sun Jul 14 21:26:06 CST 2013: ====== bundle/upload image ======
    Sun Jul 14 21:27:04 CST 2013: ====== done ======
    emi="ami-00000002"; eri="none"; eki="aki-00000001";
    

List the registered images.
There are two ways to check: the euca client and the nova client.

Note: image registration passes through the decrypting, untarring, uploading and available states, so expect to wait a few minutes (a polling loop is sketched after the listings below).

    
    ~ euca-describe-images
    
    IMAGE   aki-00000001    images/precise-server-cloudimg-i386-vmlinuz-virtual.manifest.xmlavailable        private         i386    kernel                          instance-store
    IMAGE   ami-00000002    images/precise-server-cloudimg-i386.img.manifest.xml            available        private         i386    machine aki-00000001                    instance-store
    
    ~ nova image-list
    +--------------------------------------+-----------------------------------------------------+--------+--------+
    |                  ID                  |                         Name                        | Status | Server |
    +--------------------------------------+-----------------------------------------------------+--------+--------+
    | 306eb471-bbc5-495e-b7a1-484e11f71502 | images/precise-server-cloudimg-i386-vmlinuz-virtual | ACTIVE |        |
    | 9dbf632e-b0d8-4230-a0e5-ee3836040492 | images/precise-server-cloudimg-i386.img             | ACTIVE |        |
    +--------------------------------------+-----------------------------------------------------+--------+--------+
    
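Instead of re-running euca-describe-images by hand while the image works through those states, a small polling loop can wait for it. A sketch assuming the image id ami-00000002 printed above:

~ until euca-describe-images | grep ami-00000002 | grep -q available; do
      echo "image not available yet, waiting..."
      sleep 30
  done
~ echo "ami-00000002 is available"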

Set up networking access for the cloud instances (default security group rules)

    
    ~ euca-authorize default -P tcp -p 22 -s 0.0.0.0/0
    GROUP   default
    PERMISSION      default ALLOWS  tcp     22      22      FROM    CIDR    0.0.0.0/0
    
    ~ euca-authorize default -P icmp -t -1:-1
    GROUP   default
    PERMISSION      default ALLOWS  icmp    -1      -1      FROM    CIDR    0.0.0.0/0
    

Check the available disk space and instance flavors

    
    ~ df -h
    Filesystem      Size  Used Avail Use% Mounted on
    /dev/sda1        34G  3.7G   29G  12% /
    udev            3.0G  4.0K  3.0G   1% /dev
    tmpfs           1.2G  316K  1.2G   1% /run
    none            5.0M     0  5.0M   0% /run/lock
    none            3.0G     0  3.0G   0% /run/shm
    cgroup          3.0G     0  3.0G   0% /sys/fs/cgroup
    
    ~ nova flavor-list
    +----+-----------+-----------+------+-----------+------+-------+-------------+
    | ID |    Name   | Memory_MB | Disk | Ephemeral | Swap | VCPUs | RXTX_Factor |
    +----+-----------+-----------+------+-----------+------+-------+-------------+
    | 1  | m1.tiny   | 512       | 0    | 0         |      | 1     | 1.0         |
    | 2  | m1.small  | 2048      | 10   | 20        |      | 1     | 1.0         |
    | 3  | m1.medium | 4096      | 10   | 40        |      | 2     | 1.0         |
    | 4  | m1.large  | 8192      | 10   | 80        |      | 4     | 1.0         |
    | 5  | m1.xlarge | 16384     | 10   | 160       |      | 8     | 1.0         |
    +----+-----------+-----------+------+-----------+------+-------+-------------+
    

There are 29G of disk space left, so according to the flavor list above I can choose tiny or small for this instance.

Here I go with the tiny flavor:

    
    ~ euca-run-instances ami-00000002 -t m1.tiny -k openstack
    RESERVATION     r-f6y5tydu      cookbook        default
    INSTANCE        i-00000001      ami-00000002                    pending openstack (cookbook, None)       0               m1.tiny 2013-07-14T13:30:02.000Z        unknown zone    aki-00000001                     monitoring-disabled     
    

List the running instances.
This step also takes a few minutes to settle:

    
    ~ euca-describe-instances
    RESERVATION     r-f6y5tydu      cookbook        default
    INSTANCE        i-00000001      ami-00000002    172.16.1.1      10.0.0.3        running openstack (cookbook, nova)       0               m1.tiny 2013-07-14T13:30:02.000Z        nova     aki-00000001                    monitoring-disabled     172.16.1.1      10.0.0.3instance-store
    
    ~ nova list
    +--------------------------------------+----------+--------+----------------------------+
    |                  ID                  |   Name   | Status |          Networks          |
    +--------------------------------------+----------+--------+----------------------------+
    | d6e5fe88-1950-48f4-853a-2fd57e6c72f4 | Server 1 | ACTIVE | vmnet=10.0.0.3, 172.16.1.1 |
    +--------------------------------------+----------+--------+----------------------------+
    
    ~ top
    11424 libvirt-  20   0 1451m 322m 7284 S  105  5.4   1:20.58 qemu-system-x86
     8962 nova      20   0  265m 104m 5936 S    3  1.8   0:17.57 nova-api
       35 root      25   5     0    0    0 S    1  0.0   0:00.66 ksmd
     5933 rabbitmq  20   0 2086m  28m 2468 S    1  0.5   0:03.33 beam.smp
     8969 nova      20   0  195m  48m 4748 S    1  0.8   0:02.77 nova-scheduler
     9190 nova      20   0 1642m  63m 6660 S    1  1.1   0:08.14 nova-compute
     8522 mysql     20   0 1118m  54m 8276 S    1  0.9   0:04.45 mysqld
     8951 nova      20   0  197m  50m 4756 S    1  0.9   0:03.09 nova-network
    

7. Log in to the cloud instance

Ping the instance's floating IP

    
    ~ ping 172.16.1.1
    PING 172.16.1.1 (172.16.1.1) 56(84) bytes of data.
    64 bytes from 172.16.1.1: icmp_req=1 ttl=64 time=5.98 ms
    64 bytes from 172.16.1.1: icmp_req=2 ttl=64 time=2.00 ms
    64 bytes from 172.16.1.1: icmp_req=3 ttl=64 time=3.27 ms
    

Log in to the instance with the keypair certificate

    
    ~ ssh -i openstack.pem ubuntu@172.16.1.1
    The authenticity of host '172.16.1.1 (172.16.1.1)' can't be established.
    ECDSA key fingerprint is b8:0b:a6:18:0d:30:06:ea:79:c7:17:e5:29:34:55:39.
    Are you sure you want to continue connecting (yes/no)? yes
    Warning: Permanently added '172.16.1.1' (ECDSA) to the list of known hosts.
    

Run a few simple commands inside the instance

    
    ~ ubuntu@server-1:~$ who
    ubuntu   pts/0        2013-07-14 13:35 (172.16.1.1)
    
    ~ ubuntu@server-1:~$ ifconfig
    eth0      Link encap:Ethernet  HWaddr fa:16:3e:22:75:8f
              inet addr:10.0.0.3  Bcast:10.0.0.63  Mask:255.255.255.192
              inet6 addr: fe80::f816:3eff:fe22:758f/64 Scope:Link
              UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
              RX packets:263 errors:0 dropped:0 overruns:0 frame:0
              TX packets:250 errors:0 dropped:0 overruns:0 carrier:0
              collisions:0 txqueuelen:1000
              RX bytes:29823 (29.8 KB)  TX bytes:28061 (28.0 KB)
    
    lo        Link encap:Local Loopback
              inet addr:127.0.0.1  Mask:255.0.0.0
              inet6 addr: ::1/128 Scope:Host
              UP LOOPBACK RUNNING  MTU:16436  Metric:1
              RX packets:0 errors:0 dropped:0 overruns:0 frame:0
              TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
              collisions:0 txqueuelen:0
              RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)
    

The nova installation experiment is complete!

8. Troubleshooting summary

The steps above may look like nothing more than running commands, but in practice you will hit plenty of problems.

1. Version issue: on Ubuntu 12.04.2, exactly the same steps fail when creating the cloud instance. The instance never comes up and jumps straight from the RUNNING state to SHUTDOWN; it cannot be pinged and ssh cannot connect.

2. Version issue: on Ubuntu 13.04, the commands, the Nova database, the Nova configuration files and the Nova services all differ from the examples in the book, so the book's steps cannot be followed as written.

3. Dependency issue: surprisingly, pm-utils is not declared as a dependency of libvirtd and has to be installed by hand.
Without it you will see the following errors:

    
    ~ sudo cat /var/log/libvirt/libvirtd.log
    
    2013-07-13 12:20:25.511+0000: 9292: info : libvirt version: 0.9.8
    2013-07-13 12:20:25.511+0000: 9292: error : virExecWithHook:327 : Cannot find 'pm-is-supported' in path: No such file or directory
    2013-07-13 12:20:25.511+0000: 9292: warning : qemuCapsInit:856 : Failed to get host power management capabilities
    2013-07-13 12:20:25.653+0000: 9292: error : virExecWithHook:327 : Cannot find 'pm-is-supported' in path: No such file or directory
    2013-07-13 12:20:25.653+0000: 9292: warning : lxcCapsInit:77 : Failed to get host power management capabilities
    2013-07-13 12:20:25.654+0000: 9292: error : virExecWithHook:327 : Cannot find 'pm-is-supported' in path: No such file or directory
    2013-07-13 12:20:25.655+0000: 9292: warning : umlCapsInit:87 : Failed to get host power management capabilities
    
#fix the pm-is-supported error
    ~ sudo apt-get -y install pm-utils
    ~ sudo stop libvirt-bin
    ~ sudo start libvirt-bin
    

4. Nova status issue: image registration can get stuck at untarring forever, with nova image-list showing the status as saving.
This happens when the image fails to register for whatever reason; my first attempt got stuck in this state because the disk filled up, and there was no error message at all, which was very frustrating.

Fix: deregister the stuck images and register them again.

    
    euca-deregister aki-00000001
    euca-deregister ami-00000002
    

Patience is essential here; persistence wins.

Please credit the source when reposting:
    http://whgmhg.com/vps-nova-setup/


A single-cluster Cassandra experiment with 2 nodes

    cassandra-title

Preface

Apache Cassandra is an open-source distributed key-value store. It was originally developed at Facebook to store very large volumes of data. Its main characteristics are distribution, a column-based structured data model, and high scalability. As one of the representative NoSQL systems it has since been overtaken by HBase, but many of Cassandra's design ideas are still well worth studying and borrowing from. Thanks to teacher tigerfish for the detailed lectures, from which I gained a lot!

A few concepts that are especially useful in Cassandra: consistent hashing, the Gossip protocol, snitches, replication strategies, DHT, and Bloom filters.

About the author:
张丹 (Conan), programmer: Java, R, PHP, Javascript
    weibo:@Conan_Z
    blog: http://whgmhg.com
    email: bsspirit@gmail.com

Please credit the source when reposting:
    http://whgmhg.com/cassandra-clustor/

Table of contents

1. A 2-node Cassandra cluster experiment
2. Errors encountered during the experiment and their fixes

1. A 2-node Cassandra cluster experiment

1. Download Cassandra and set up the Java environment (skipped here; a minimal sketch follows the directory check below)
2. Install the first Cassandra node by unpacking it into the /home/conan/toolkit/cassandra125 directory

    ~ pwd
    /home/conan/toolkit/cassandra125
    
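For completeness, the skipped download and unpack step might look like the sketch below; the mirror URL and the 1.2.5 version are assumptions based on the cassandra125 directory name used throughout this post:

~ cd /home/conan/toolkit
~ wget http://archive.apache.org/dist/cassandra/1.2.5/apache-cassandra-1.2.5-bin.tar.gz
~ tar xzf apache-cassandra-1.2.5-bin.tar.gz
~ mv apache-cassandra-1.2.5 cassandra125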

IP address: 192.168.1.200

    
    ~ ifconfig
    eth0      Link encap:Ethernet  HWaddr 08:00:27:90:e8:19
          inet addr:192.168.1.200  Bcast:192.168.1.255  Mask:255.255.255.0
          inet6 addr: fe80::a00:27ff:fe90:e819/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:16943 errors:0 dropped:0 overruns:0 frame:0
          TX packets:19527 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:1433046 (1.4 MB)  TX bytes:2902059 (2.9 MB)
    

3. Set the environment variables

    
    ~ sudo vi /etc/environment
    CASSANDRA_HOME=/home/conan/toolkit/cassandra125
    
    ~ . /etc/environment
    
    ~ export |grep /home/conan/toolkit/cassandra125
    declare -x CASSANDRA_HOME="/home/conan/toolkit/cassandra125"
    declare -x OLDPWD="/home/conan/toolkit/cassandra125"
    declare -x PWD="/home/conan/toolkit/cassandra125/bin"
    

4. Create the data and log directories

    
    ~ sudo rm -rf /var/lib/cassandra
    
    ~ sudo mkdir -p /var/lib/cassandra/data
    ~ sudo mkdir -p /var/lib/cassandra/saved_caches
    ~ sudo mkdir -p /var/lib/cassandra/commitlog
    ~ sudo mkdir -p /var/log/cassandra/
    
    ~ sudo chown -R conan:conan /var/lib/cassandra
    ~ sudo chown -R conan:conan /var/log/cassandra
    
    ~ ls -l /var/lib/cassandra
    drwxr-xr-x  2 conan conan 4096 Jul  4 00:15 commitlog/
    drwxr-xr-x  2 conan conan 4096 Jul  4 00:15 data/
    drwxr-xr-x  2 conan conan 4096 Jul  4 00:15 saved_caches/
    

5. Edit the cassandra.yaml configuration file; the changed settings are listed below in file order

    
    ~ vi /home/conan/toolkit/cassandra125/conf/cassandra.yaml
    
    cluster_name: 'case1'
    num_tokens: 256
    
    data_file_directories:
    - /var/lib/cassandra/data
    
    commitlog_directory: /var/lib/cassandra/commitlog
    
    saved_caches_directory: /var/lib/cassandra/saved_caches
    
    seed_provider:
    - class_name: org.apache.cassandra.locator.SimpleSeedProvider
      parameters:
          - seeds: "192.168.1.200"
    
    #listen_address: localhost
    listen_address: 192.168.1.200
    
    #rpc_address: localhost
    rpc_address: 192.168.1.200
    
    endpoint_snitch: SimpleSnitch
    

6. Start the node

    
    ~ cd /home/conan/toolkit/cassandra125/
    
    ~ bin/cassandra -f
    
#partial log output
    INFO 00:23:22,785 Enqueuing flush of Memtable-schema_columnfamilies@1792194126(1097/1097 serialized/live bytes, 20 ops)
    INFO 00:23:22,786 Writing Memtable-schema_columnfamilies@1792194126(1097/1097 serialized/live bytes, 20 ops)
    INFO 00:23:22,796 Completed flushing /var/lib/cassandra/data/system/schema_columnfamilies/system-schema_columnfamilies-ic-2-Data.db (698 bytes) for commitlog position ReplayPosition(segmentId=1372868601408, position=64705)
    INFO 00:23:22,797 Enqueuing flush of Memtable-schema_columns@552364977(251/251 serialized/live bytes, 5 ops)
    INFO 00:23:22,798 Writing Memtable-schema_columns@552364977(251/251 serialized/live bytes, 5 ops)
    INFO 00:23:22,808 Completed flushing /var/lib/cassandra/data/system/schema_columns/system-schema_columns-ic-2-Data.db (209 bytes) for commitlog position ReplayPosition(segmentId=1372868601408, position=64705)
    INFO 00:23:22,894 Starting listening for CQL clients on /192.168.1.200:9042...
    INFO 00:23:22,906 Binding thrift service to /192.168.1.200:9160
    INFO 00:23:22,931 Using TFramedTransport with a max frame size of 15728640 bytes.
    INFO 00:23:22,952 Using synchronous/threadpool thrift server on 192.168.1.200 : 9160
    INFO 00:23:22,953 Listening for thrift clients...
    INFO 00:23:33,101 Created default superuser 'cassandra'
    

7. Check the cluster status

    
    bin/nodetool status
    Datacenter: datacenter1
    =======================
    Status=Up/Down
    |/ State=Normal/Leaving/Joining/Moving
    --  Address        Load       Tokens  Owns (effective)  Host ID                               Rack
    UN  192.168.1.200  51.01 KB   256     100.0%            e7106e0a-1a9e-43a2-9bcc-fc1201076fee  rack1
    

8. Add a second node to the cluster, bringing it to 2 nodes
Machine IP: 192.168.1.201

    
    ~ ifconfig
    eth0      Link encap:Ethernet  HWaddr 08:00:27:0d:0b:0b
              inet addr:192.168.1.201  Bcast:192.168.1.255  Mask:255.255.255.0
              inet6 addr: fe80::a00:27ff:fe0d:b0b/64 Scope:Link
              UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
              RX packets:45455 errors:0 dropped:0 overruns:0 frame:0
              TX packets:14717 errors:0 dropped:0 overruns:0 carrier:0
              collisions:0 txqueuelen:1000
              RX bytes:33590582 (33.5 MB)  TX bytes:2549931 (2.5 MB)
    

9. Copy the cassandra125 and jdk16 directories over from the first node, 192.168.1.200

    
    ~ pwd
    /home/conan/toolkit
    
    ~ scp -r conan@192.168.1.200:/home/conan/toolkit/cassandra125 .
    ~ scp -r conan@192.168.1.200:/home/conan/toolkit/jdk16 .
    
    ~ ls -l
    drwxrwxr-x  9 conan conan 4096 Apr 25 03:04 cassandra125
    drwxr-xr-x 10 conan conan 4096 Apr 25 03:33 jdk16
    

10. Set the environment variables

    
    ~ sudo vi /etc/environment
    
    PATH="/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/home/conan/toolkit/jdk16/bin:/home/conan/toolkit/cassandra/bin"
    JAVA_HOME=/home/conan/toolkit/jdk16
    CASSANDRA_HOME=/home/conan/toolkit/cassandra125
    
    ~ . /etc/environment
    
    ~ java -version
    java version "1.6.0_29"
    Java(TM) SE Runtime Environment (build 1.6.0_29-b11)
    Java HotSpot(TM) 64-Bit Server VM (build 20.4-b02, mixed mode)
    

11. Create the data and log directories

    
    ~ sudo rm -rf /var/lib/cassandra
    
    ~ sudo mkdir -p /var/lib/cassandra/data
    ~ sudo mkdir -p /var/lib/cassandra/saved_caches
    ~ sudo mkdir -p /var/lib/cassandra/commitlog
    ~ sudo mkdir -p /var/log/cassandra/
    
    ~ sudo chown -R conan:conan /var/lib/cassandra
    ~ sudo chown -R conan:conan /var/log/cassandra
    
    ~ ls -l /var/lib/cassandra
    drwxr-xr-x  2 conan conan 4096 Jul  4 00:15 commitlog/
    drwxr-xr-x  2 conan conan 4096 Jul  4 00:15 data/
    drwxr-xr-x  2 conan conan 4096 Jul  4 00:15 saved_caches/
    

12. Edit cassandra.yaml; the changed settings are listed below in file order

    
    ~ vi /home/conan/toolkit/cassandra125/conf/cassandra.yaml
    
    cluster_name: 'case1'
    
    seed_provider:
    - class_name: org.apache.cassandra.locator.SimpleSeedProvider
      parameters:
          - seeds: "192.168.1.200"
    
    listen_address: 192.168.1.201
    
    rpc_address: 192.168.1.201
    

13. Start the node at 192.168.1.201

    
    ~ bin/cassandra -f
    
//partial log output
    INFO 03:36:47,476 Completed flushing /var/lib/cassandra/data/system/local/system-local-ic-4-Data.db (75 bytes) for commitlog position ReplayPosition(segmentId=1366832174115, position=77582)
    INFO 03:36:47,504 Compacting [SSTableReader(path='/var/lib/cassandra/data/system/local/system-local-ic-1-Data.db'), SSTableReader(path='/var/lib/cassandra/data/system/local/system-local-ic-3-Data.db'), SSTableReader(path='/var/lib/cassandra/data/system/local/system-local-ic-4-Data.db'), SSTableReader(path='/var/lib/cassandra/data/system/local/system-local-ic-2-Data.db')]
    INFO 03:36:47,527 Enqueuing flush of Memtable-local@692438881(10094/10094 serialized/live bytes, 257 ops)
    INFO 03:36:47,533 Writing Memtable-local@692438881(10094/10094 serialized/live bytes, 257 ops)
    INFO 03:36:47,547 Completed flushing /var/lib/cassandra/data/system/local/system-local-ic-5-Data.db (5365 bytes) for commitlog position ReplayPosition(segmentId=1366832174115, position=89585)
    INFO 03:36:47,660 Node /192.168.1.201 state jump to normal
    INFO 03:36:47,663 Startup completed! Now serving reads.
    INFO 03:36:47,686 Compacted 4 sstables to [/var/lib/cassandra/data/system/local/system-local-ic-6,].  5,956 bytes to 5,687 (~95% of original) in 158ms = 0.034326MB/s.  4 total rows, 1 unique.  Row merge counts were {1:0, 2:0, 3:0, 4:1, }
    INFO 03:36:47,771 Starting listening for CQL clients on /192.168.1.201:9042...
    INFO 03:36:47,785 Binding thrift service to /192.168.1.201:9160
    INFO 03:36:47,810 Using TFramedTransport with a max frame size of 15728640 bytes.
    INFO 03:36:47,834 Using synchronous/threadpool thrift server on 192.168.1.201 : 9160
    INFO 03:36:47,834 Listening for thrift clients...
    

14. Check the log on node 1, 192.168.1.200

    
    INFO 01:01:27,382 InetAddress /192.168.1.201 is now UP
    INFO 01:01:58,660 Beginning transfer to /192.168.1.201
    INFO 01:01:58,661 Flushing memtables for [CFS(Keyspace='system_auth', ColumnFamily='users')]...
    INFO 01:01:58,663 Enqueuing flush of Memtable-users@1338035062(28/28 serialized/live bytes, 2 ops)
    INFO 01:01:58,668 Writing Memtable-users@1338035062(28/28 serialized/live bytes, 2 ops)
    INFO 01:01:59,010 Completed flushing /var/lib/cassandra/data/system_auth/users/system_auth-users-ic-1-Data.db (64 bytes) for commitlog position ReplayPosition(segmentId=1372868601408, position=65900)
    INFO 01:01:59,047 Stream context metadata [/var/lib/cassandra/data/system_auth/users/system_auth-users-ic-1-Data.db sections=1 progress=0/64 - 0%], 1 sstables.
    INFO 01:01:59,048 Streaming to /192.168.1.201
    INFO 01:01:59,122 Successfully sent /var/lib/cassandra/data/system_auth/users/system_auth-users-ic-1-Data.db to /192.168.1.201
    INFO 01:01:59,123 Finished streaming session to /192.168.1.201
    INFO 01:01:59,424 Enqueuing flush of Memtable-peers@1855686378(10279/10279 serialized/live bytes, 271 ops)
    INFO 01:01:59,425 Writing Memtable-peers@1855686378(10279/10279 serialized/live bytes, 271 ops)
    INFO 01:01:59,497 Completed flushing /var/lib/cassandra/data/system/peers/system-peers-ic-1-Data.db (5538 bytes) for commitlog position ReplayPosition(segmentId=1372868601408, position=77902)
    

15. Node 192.168.1.201 has now successfully joined the cluster.

    
    bin/nodetool status
    
    Datacenter: datacenter1
    =======================
    Status=Up/Down
    |/ State=Normal/Leaving/Joining/Moving
    --  Address        Load       Tokens  Owns (effective)  Host ID                               Rack
    UN  192.168.1.200  65.46 KB   256     48.7%             e7106e0a-1a9e-43a2-9bcc-fc1201076fee  rack1
    UN  192.168.1.201  58.04 KB   256     51.3%             8eef1965-9822-44bf-a9f6-fff5b87bc474  rack1
    

Experiment complete!

2. Errors encountered during the experiment and their fixes

1. Do not change the cluster name on a node whose system keyspace has already been created.
Otherwise you will get the following error:

    
    ERROR 23:29:58,004 Fatal exception during initialization
    org.apache.cassandra.exceptions.ConfigurationException: Saved cluster name Test Cluster != configured name case1
    

Solution: http://wiki.apache.org/cassandra/FAQ (a cruder alternative for throwaway test nodes is sketched after the quoted FAQ entry)

    
    Cassandra says "ClusterName mismatch: oldClusterName != newClusterName" and refuses to start
    
    To prevent operator errors, Cassandra stores the name of the cluster in its system table. If you need to rename a cluster for some reason, you can:
    
    Perform these steps on each node:
    
    Start the cassandra-cli connected locally to this node.
Run the following:
    use system;
    set LocationInfo[utf8('L')][utf8('ClusterName')]=utf8('');
    exit;
    Run nodetool flush on this node.
    Update the cassandra.yaml file for the cluster_name as the same as 2b).
    Restart the node.
Once this operation has been performed on all nodes and they have been restarted, nodetool ring should show all nodes as UP.
    
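On a throwaway test node that holds no data worth keeping, a cruder alternative (my own shortcut, not taken from the FAQ) is to wipe the local storage directories, exactly as in step 4 above, and let the node bootstrap again under the new cluster name:

# WARNING: this destroys all data on the node; only acceptable on a test node
~ sudo rm -rf /var/lib/cassandra
~ sudo mkdir -p /var/lib/cassandra/data /var/lib/cassandra/saved_caches /var/lib/cassandra/commitlog
~ sudo chown -R conan:conan /var/lib/cassandra
~ bin/cassandra -f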

2. When editing the configuration file, there must be a space after each colon.
Otherwise you will see an error like the following:

    
    while scanning a simple key; could not found expected ':'
    

For example, when changing the setting below:

    
#wrong syntax
    listen_address:localhost
    
#correct syntax
    listen_address: localhost
    

Explanation:
The error above is a YAML parsing failure caused by the missing space between the colon and the value (a quick way to validate the file is sketched below).

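A simple way to catch this kind of mistake before starting Cassandra is to run the file through any YAML parser. A sketch using Python 2 with PyYAML, assuming the python-yaml package is installed (the original walkthrough does not cover this):

~ python -c "import yaml; yaml.safe_load(open('/home/conan/toolkit/cassandra125/conf/cassandra.yaml')); print 'cassandra.yaml parses OK'"
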
Please credit the source when reposting:
    http://whgmhg.com/cassandra-clustor/


The Dataguru Beijing offline meetup was a complete success

This post is part of the cross-disciplinary knowledge meetup series. "Knowledge is meant to be shared and passed on": conferences, forums, and salons are all excellent places to share it. I have also had the good fortune to speak at some large domestic conferences and show what I have been working on. Being the speaker feels very different from sitting in the audience; only by sharing knowledge do you gain more in return.

About the author

• 张丹 (Conan), programmer: Java, R, PHP, Javascript
    • weibo:@Conan_Z
    • blog: http://whgmhg.com
    • email: bsspirit@gmail.com

Please credit the source when reposting:
    http://whgmhg.com/dataguru-beijing-meeting-20130616/

    title

Speaker slide decks for download:

Thanks to the event team:

• Initiator and organizer: 张丹
• Registration tracking: 于双海, 何青
• Venue sponsor: 何青
• Announcements: 张丹
• Host: 李阳
• Order keeping: 于双海, 何青
• Photographers: 张丹, 于双海

    01


Attendees:

Name | Company | dataguru ID | Courses taken
张丹 | - | bsspirit | R, R visualization, SAS, Hadoop, NoSQL, Oracle, OpenStack, Kettle
李阳 | 中科辅龙 | casliyang | hadoop1
于双海 | 博彦科技 | sunev_yu | hadoop1
何青 | 微课网 | heqingcool | hadoop1
张宣彬 | Oracle | linou | hadoop1
刘盛 | financial institution | leonarding | Oracle series
王非 | 当当网 | - | -
阮宏博 | 浙江和仁科技有限公司 | 976073363@qq.com | -
李雪峰 | 浙江和仁科技有限公司 | - | -
丁波 | zejia | 独角兽老头 | R1
董红磊 | - | 董红磊 | -
梁胜和 | - | - | Oracle architecture at scale, performance tuning, SAS, R
邓奕 | - | 板凳总 | Hadoop, NoSQL
马光东 | 亚信联创 | 27112 | advanced Oracle course


    02

Videos from the event:

Self-introductions

Nginx + FastDFS image system - 王非

Big data solutions: applications based on user location analysis - 张宣彬

Oracle index optimization ideas: a case study - 刘盛

Data analysis platform planning: tying the courses together - 张丹

Please credit the source when reposting:
    http://whgmhg.com/dataguru-beijing-meeting-20130616/
