Posts in the 'Articles' category (116 total)

  1. 2013/10/22 용비 Using Crontab on Linux
  2. 2013/09/10 용비 Flume NG + Hadoop Log Collector Setting
  3. 2013/09/09 용비 Hadoop Clustering
  4. 2013/08/22 용비 Embedded Jetty + Jersey Restful Setting
  5. 2013/08/11 용비 Why You End Up Using Maven
Using Crontab on Linux

Articles 2013/10/22 용비

Allow the hadoop user to run crontab jobs.

1. Add the user to /etc/cron.allow
vi /etc/cron.allow
hadoop

2. Edit the crontab for the hadoop user
crontab -u hadoop -e
# run the script at minute 0 of every hour
0 * * * * shell_script_absolute_path

3. Restart the cron daemon
/etc/init.d/crond restart

4. Write the shell script (see the example at the end of this post)
source ~/.bash_profile   (reloads the profile so PATH is set under cron's minimal environment)
hadoop jar ......

5. Check the cron log
cat /var/log/cron

6. Error messages
User account has expired.
You (hadoop) are not allowed to access to (crontab) because of pam configuration.
== Fix: extend the account expiry date for the user
chage -E mm/dd/yy {userId}
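
A minimal sketch of the wrapper script from step 4 (the jar path, main class, and arguments are placeholders, not from the original post):

#!/bin/bash
# Reload the hadoop user's profile so PATH (and the hadoop command) are available under cron
source ~/.bash_profile

# Run the Hadoop job; replace the jar, class, and arguments with your own
hadoop jar /path/to/your-job.jar your.main.JobClass /input /output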

Flume NG + Hadoop Log Collector Setting

Articles 2013/09/10 용비

[Flume Agent]

vi flume.conf

#########################################################
#### Configure for Flume Agent (Gateway Log Agent)
#########################################################

agentA.sources = gwSource
agentA.channels = gwChannel
agentA.sinks = gwSink

##########################################
#### Configure for Source
##########################################

# For each one of the sources, the type is defined
agentA.sources.gwSource.type = exec
agentA.sources.gwSource.command = tail -Fs 180 /home/yusbha/nginx/logs/access.log
agentA.sources.gwSource.restart = true
agentA.sources.gwSource.restartThrottle=1000
agentA.sources.gwSource.interceptors = i1
agentA.sources.gwSource.interceptors.i1.type = timestamp
agentA.sources.gwSource.channels = gwChannel


##########################################
#### Configure for Channel
##########################################

# Each channel's type is defined.
agentA.channels.gwChannel.type = file
agentA.channels.gwChannel.dataDirs = /home/yusbha/flume/file-channel/data/01
agentA.channels.gwChannel.checkpointDir = /home/yusbha/flume/file-channel/checkpoint/01
agentA.channels.gwChannel.maxFileSize=524288000
agentA.channels.gwChannel.checkpointInterval=10000
agentA.channels.gwChannel.transactionCapacity=1000

##########################################
#### Configure for Sink
##########################################
# Each sink's type must be defined
agentA.sinks.gwSink.type = avro
agentA.sinks.gwSink.hostname = 172.27.106.48
agentA.sinks.gwSink.port = 35853
agentA.sinks.gwSink.channel = gwChannel

Flume Agent launch command

bin/flume-ng agent --conf conf --conf-file conf/flume.conf --name agentA -Dflume.root.logger=INFO,console

[Flume Collector]

vi flume.conf

#########################################################
#### Configure for Flume Agent (Gateway Log Collector)
#########################################################

collector.sources = collectorSource
collector.channels = collectorChannel
collector.sinks = HDFS

##########################################
#### Configure for Source : 172.27.106.48
##########################################

# For each one of the sources, the type is defined.
collector.sources.collectorSource.type = avro
collector.sources.collectorSource.bind = 0.0.0.0
collector.sources.collectorSource.port = 35853
collector.sources.collectorSource.channels = collectorChannel

##########################################
#### Configure for Channel
##########################################
collector.channels.collectorChannel.type = memory

#collector.channels.collectorChannel.type = file
#collector.channels.collectorChannel.dataDirs = /home/hadoop/flume/file-channel/data/01
#collector.channels.collectorChannel.checkpointDir = /home/hadoop/flume/file-channel/checkpoint/01
#collector.channels.collectorChannel.transactionCapacity = 1000
#collector.channels.collectorChannel.checkpointInterval = 30000
#collector.channels.collectorChannel.maxFileSize = 2146435071
#collector.channels.collectorChannel.minimumRequiredSpace = 524288000
#collector.channels.collectorChannel.keep-alive = 5
#collector.channels.collectorChannel.write-timeout = 10
#collector.channels.collectorChannel.checkpoint-timeout = 600
#collector.channels.collectorChannel.capacity = 500000

##########################################
#### Configure for Sink
##########################################
# Each sink's type must be defined
collector.sinks.HDFS.type = hdfs
collector.sinks.HDFS.hdfs.path = hdfs://name.odp.kt.com/logs
collector.sinks.HDFS.hdfs.filePrefix = %Y%m%d%H%M%S
collector.sinks.HDFS.hdfs.fileType = DataStream
collector.sinks.HDFS.hdfs.fileSuffix = .log
collector.sinks.HDFS.hdfs.inUseSuffix = .work

collector.sinks.HDFS.hdfs.maxOpenFiles = 200
collector.sinks.HDFS.hdfs.rollSize = 0
collector.sinks.HDFS.hdfs.rollInterval = 60
collector.sinks.HDFS.hdfs.rollCount = 0
collector.sinks.HDFS.hdfs.rollTimerPoolSize = 1
collector.sinks.HDFS.hdfs.batchSize = 100
collector.sinks.HDFS.hdfs.threadsPoolSize = 1
collector.sinks.HDFS.hdfs.callTimeout = 60000

collector.sinks.HDFS.hdfs.writeFormat = TEXT
collector.sinks.HDFS.serializer = text
collector.sinks.HDFS.serializer.appendNewline = true
collector.sinks.HDFS.channel = collectorChannel

Note: the %Y%m%d%H%M%S escapes in hdfs.filePrefix are resolved from the timestamp header that the agent-side timestamp interceptor adds to each event.

Flume Collector launch command

bin/flume-ng agent --conf conf --conf-file conf/flume.conf --name collector -Dflume.root.logger=INFO,console

Hadoop Clustering

Articles 2013/09/09 18:05 용비
[Common]
OS : CentOS 6.4 64-bit
Java : OpenJDK 1.7
Hadoop : 1.2.1

1. Installation servers
- Name Node : 172.27.106.48 (name.odp.kt.com)
- Data Node : 172.27.233.144 (data01.odp.kt.com)

2. Install OpenJDK
- Run as: root
- Target servers: all servers (name node, data node)
yum -y install "*openjdk*"

3. Add the hadoop account
- Run as: root
- Target servers: all servers (name node, data node)
groupadd hadoop
useradd -g hadoop hadoop
passwd hadoop

4. Configure hosts
- Run as: root
- Target servers: all servers (name node, data node)
vi /etc/hosts (edit the hosts file)
== Add the following at the end
172.27.106.48 name.odp.kt.com
172.27.233.144 data01.odp.kt.com

5. Firewall (disable iptables)
- Run as: root
- Target servers: all
service iptables stop
chkconfig iptables off

[NameNode server setup]
1. Create the data directories
- Run as: hadoop
- Target server: name.odp.kt.com
mkdir $HOME/data
mkdir $HOME/data/name

2. Set up passwordless SSH
- Run as: hadoop
- Target server: name.odp.kt.com
== Generate an SSH key
ssh-keygen -t rsa

== Copy the public key to the data node
ssh-copy-id -i ~/.ssh/id_rsa.pub hadoop@data01.odp.kt.com

3. Install Hadoop
- Run as: hadoop
- Target server: name.odp.kt.com
- Hadoop version: 1.1.2 or 1.2.1
tar xvf hadoop-1.x.x.tar.gz

4. Configure Hadoop
- Run as: hadoop
- Target server: name.odp.kt.com
vi hadoop-env.sh

export HADOOP_HOME=/home/hadoop/hadoop-1.2.1
export HADOOP_HOME_WARN_SUPPRESS="TRUE"
 
# export JAVA_HOME=/usr/lib/jvm/jre-1.6.0-openjdk.x86_64
export JAVA_HOME=/usr/lib/jvm/jre-1.7.0-openjdk.x86_64
export HADOOP_OPTS=-server

vi core-site.xml

<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://name.odp.kt.com:9000</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/tmp/hadoop-${user.name}</value>
</property>
</configuration>

vi hdfs-site.xml

<configuration>
<property>
<name>dfs.name.dir</name>
<value>/home/hadoop/data/name,/home/hadoop/data/backup</value>
</property>
<property>
<name>dfs.data.dir</name>
<value>/home/hadoop/data/node01,/home/hadoop/data/node02</value>
</property>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>fs.trash.interval</name>
<value>30</value>
</property>
<property>
<name>dfs.webhdfs.enabled</name>
<value>true</value>
</property>
<property>
<name>dfs.support.append</name>
<value>true</value>
</property>
<property>
<name>dfs.support.broken.append</name>
<value>true</value>
</property>
<property>
<name>dfs.permissions</name>
<value>true</value>
</property>
<property>
<name>dfs.web.ugi</name>
<value>hadoop,supergroup</value>
</property>
<property>
<name>dfs.permissions.supergroup</name>
<value>supergroup</value>
</property>
<property>
<name>dfs.upgrade.permission</name>
<value>0777</value>
</property>
<property>
<name>dfs.umaskmode</name>
<value>022</value>
</property>
<property>
<name>dfs.http.address</name>
<value>name.odp.kt.com:50070</value>
</property>
</configuration>

vi mapred-site.xml

<configuration>
<property>
<name>mapred.job.tracker</name>
<value>hdfs://name.odp.kt.com:9001</value>
</property>
<property>
<name>mapred.system.dir</name>
<value>/home/hadoop/data/mapred/system</value>
</property>
<property>
<name>mapred.local.dir</name>
<value>/home/hadoop/data/mapred/local</value>
</property>
</configuration>

vi conf/masters <= no changes needed; this file normally lists the secondary name node.

vi conf/slaves
== Add the following line
data01.odp.kt.com

Distribute the Hadoop installation directory to the data node
scp -r /home/hadoop/hadoop-1.2.1 data01.odp.kt.com:/home/hadoop/hadoop-1.2.1

Distribute the configuration
rsync -av /home/hadoop/hadoop-1.2.1/conf hadoop@data01.odp.kt.com:/home/hadoop/hadoop-1.2.1

Run Hadoop
== Format the NameNode
./hadoop namenode -format

== Start Hadoop
./start-all.sh

== Check cluster status
./hadoop dfsadmin -report

== Stop Hadoop
./stop-all.sh
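
Once the daemons are up, a quick smoke test confirms HDFS accepts writes (the /logs path below matches the directory the Flume collector above writes to; adjust as needed):

./hadoop fs -mkdir /logs
./hadoop fs -put /etc/hosts /logs/hosts.test
./hadoop fs -ls /logs
./hadoop fs -rm /logs/hosts.test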

[DataNode server setup]
- Run as: hadoop
- Target server: data01.odp.kt.com
mkdir $HOME/data
mkdir $HOME/data/node01
mkdir $HOME/data/node02


Embedded Jetty + Jersey Restful Setting

Articles 2013/08/22 용비

<Jetty Version : 9.0.4>
Required Jetty libraries: jetty-servlet, servlet-api (dependencies managed with Maven); the Jersey 1.x servlet container (com.sun.jersey) also needs to be on the classpath. The original snippet is shown below wrapped in a minimal main class; the class name and port are illustrative.
import org.eclipse.jetty.server.Server;
import org.eclipse.jetty.servlet.ServletContextHandler;
import org.eclipse.jetty.servlet.ServletHolder;
import com.sun.jersey.spi.container.servlet.ServletContainer;

public class RestServer {
    // Package containing the JAX-RS resource classes (e.g. net.yongbi.rest)
    private static final String JERSEY_RESOURCES = "net.yongbi.rest";

    public static void main(String[] args) throws Exception {
        int port = 8080;   // example port
        Server server = new Server(port);
        ServletContextHandler context = new ServletContextHandler(ServletContextHandler.SESSIONS);
        context.setContextPath("/");
        server.setHandler(context);
        // Register the Jersey servlet and point it at the resource package
        ServletHolder h = new ServletHolder(new ServletContainer());
        h.setInitParameter("com.sun.jersey.config.property.packages", JERSEY_RESOURCES);
        context.addServlet(h, "/*");
        server.start();
        server.join();
    }
}
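
For completeness, a minimal resource class in the scanned package might look like the following (class name, path, and return value are illustrative):

package net.yongbi.rest;

import javax.ws.rs.GET;
import javax.ws.rs.Path;
import javax.ws.rs.Produces;
import javax.ws.rs.core.MediaType;

// Jersey discovers this class because it lives in the package given by JERSEY_RESOURCES
@Path("/hello")
public class HelloResource {
    @GET
    @Produces(MediaType.TEXT_PLAIN)
    public String hello() {
        return "hello";   // GET http://host:port/hello returns this body
    }
}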


Why You End Up Using Maven

Articles 2013/08/11 용비

[Don't be picky about programming languages, and don't be picky about development tools.]

That's something I have always believed.
Whatever the language, they are not all that different from one another, so even a completely new language can be picked up quickly and used to build a program, and whatever the development tool, it can be picked up just as quickly.

Recently, though, after creating a Maven project and working with it while developing in Eclipse,
I experienced how much easier Maven makes things, and a corner of my mind is now convinced there is simply no way around using it.

It takes care of the most painful part, managing referenced libraries, on its own, and when packaging it can even gather every dependent library in one shot (though this does not work in the Eclipse Maven plugin, m2).

Until now I developed Java programs by adding libraries one by one by hand, but after trying Maven this time, development felt remarkably easy.

There is a reason something becomes the mainstream.
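
A minimal pom.xml sketch of the one-shot packaging mentioned above, using the standard maven-assembly-plugin (the plugin choice is an assumption; the post does not say which plugin was used):

<build>
  <plugins>
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-assembly-plugin</artifactId>
      <configuration>
        <descriptorRefs>
          <!-- bundle the project classes and every dependency into one jar -->
          <descriptorRef>jar-with-dependencies</descriptorRef>
        </descriptorRefs>
      </configuration>
    </plugin>
  </plugins>
</build>

Running mvn package assembly:single then produces a single *-jar-with-dependencies.jar under target/.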