Tuesday, May 8, 2012

Hadoop Learning Notes


This installation and configuration are done under the Windows XP OS.
Prepare three software packages:
1. Cygwin (http://cygwin.com/setup.exe)
2. Hadoop (http://mirror.bjtu.edu.cn/apache/hadoop/common/hadoop-0.20.2/hadoop-0.20.2.tar.gz)
3. JDK (version 6 or above)

Install Cygwin under the D:\ directory.
Extract Hadoop under D:\cygwin.
Install the JDK under C:\.

Then do the configuration: add the commands below to .bashrc.
export JAVA_HOME=/cygdrive/c/Java/jdk1.7.0_03
export PATH=$JAVA_HOME/bin:$PATH
export CLASSPATH=$JAVA_HOME/lib/tools.jar:$JAVA_HOME/lib/dt.jar

Additionally, we need to modify conf/hadoop-env.sh under the hadoop directory to configure JAVA_HOME:
export JAVA_HOME=/cygdrive/c/Java/jdk1.7.0_03

The configuration is now done.
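As a quick sanity check (my own suggestion, assuming the JDK path used above), you can confirm the environment in a fresh Cygwin shell:

```shell
# Re-apply the .bashrc settings and confirm they took effect.
export JAVA_HOME=/cygdrive/c/Java/jdk1.7.0_03
export PATH=$JAVA_HOME/bin:$PATH
echo "JAVA_HOME is $JAVA_HOME"
# Prints the JDK version banner if java is reachable on the PATH:
command -v java >/dev/null 2>&1 && java -version || echo "java not found on PATH"
```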
 
$ bin/hadoop
Usage: hadoop [--config confdir] COMMAND
where COMMAND is one of:
  namenode -format     format the DFS filesystem
  secondarynamenode    run the DFS secondary namenode
  namenode             run the DFS namenode
  datanode             run a DFS datanode
  dfsadmin             run a DFS admin client
  mradmin              run a Map-Reduce admin client
  fsck                 run a DFS filesystem checking utility
  fs                   run a generic filesystem user client
  balancer             run a cluster balancing utility
  jobtracker           run the MapReduce job Tracker node
  pipes                run a Pipes job
  tasktracker          run a MapReduce task Tracker node
  job                  manipulate MapReduce jobs
  queue                get information regarding JobQueues
  version              print the version
  jar <jar>            run a jar file
  distcp <srcurl> <desturl> copy file or directories recursively
  archive -archiveName NAME <src>* <dest> create a hadoop archive
  daemonlog            get/set the log level for each daemon
or
  CLASSNAME            run the class named CLASSNAME
Most commands print help when invoked w/o parameters.


Next we can run the wordcount example program:
1. Create an input folder (the program will automatically create the output folder)
2. Put some test files into the input folder
3. $ bin/hadoop  jar hadoop-0.20.2-examples.jar wordcount input output
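Steps 1 and 2 can be sketched as follows (the file name and contents are just an example; any plain-text files will do):

```shell
# Create the input folder and drop a small test file into it.
mkdir -p input
printf 'hello world\nhello hadoop\n' > input/sample.txt
cat input/sample.txt
```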
12/03/05 04:05:43 INFO jvm.JvmMetrics: Initializing JVM Metrics with processName=JobTracker, sessionId=
12/03/05 04:05:43 INFO input.FileInputFormat: Total input paths to process : 1
12/03/05 04:05:44 INFO mapred.JobClient: Running job: job_local_0001
12/03/05 04:05:44 INFO input.FileInputFormat: Total input paths to process : 1
12/03/05 04:05:44 INFO mapred.MapTask: io.sort.mb = 100
12/03/05 04:05:44 INFO mapred.MapTask: data buffer = 79691776/99614720
12/03/05 04:05:44 INFO mapred.MapTask: record buffer = 262144/327680
12/03/05 04:05:44 INFO mapred.MapTask: Starting flush of map output
12/03/05 04:05:44 WARN mapred.LocalJobRunner: job_local_0001
java.io.IOException: Expecting a line not the end of stream
        at org.apache.hadoop.fs.DF.parseExecResult(DF.java:109)
        at org.apache.hadoop.util.Shell.runCommand(Shell.java:179)
        at org.apache.hadoop.util.Shell.run(Shell.java:134)
        at org.apache.hadoop.fs.DF.getAvailable(DF.java:73)
        at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:329)
        at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:124)
        at org.apache.hadoop.mapred.MapOutputFile.getSpillFileForWrite(MapOutputFile.java:107)
        at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.sortAndSpill(MapTask.java:1221)
        at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.flush(MapTask.java:1129)
        at org.apache.hadoop.mapred.MapTask$NewOutputCollector.close(MapTask.java:549)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:623)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
        at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:177)
12/03/05 04:05:45 INFO mapred.JobClient:  map 0% reduce 0%
12/03/05 04:05:45 INFO mapred.JobClient: Job complete: job_local_0001
12/03/05 04:05:45 INFO mapred.JobClient: Counters: 0
The above problem can be solved by setting the LANG environment variable:
export LANG=en.utf8
$ bin/hadoop  jar hadoop-0.20.2-examples.jar wordcount input output
12/03/05 04:07:18 INFO jvm.JvmMetrics: Initializing JVM Metrics with processName=JobTracker, sessionId=
12/03/05 04:07:18 INFO input.FileInputFormat: Total input paths to process : 1
12/03/05 04:07:19 INFO mapred.JobClient: Running job: job_local_0001
12/03/05 04:07:19 INFO input.FileInputFormat: Total input paths to process : 1
12/03/05 04:07:19 INFO mapred.MapTask: io.sort.mb = 100
12/03/05 04:07:19 INFO mapred.MapTask: data buffer = 79691776/99614720
12/03/05 04:07:19 INFO mapred.MapTask: record buffer = 262144/327680
12/03/05 04:07:19 INFO mapred.MapTask: Starting flush of map output
12/03/05 04:07:19 INFO mapred.MapTask: Finished spill 0
12/03/05 04:07:19 INFO mapred.TaskRunner: Task:attempt_local_0001_m_000000_0 is done. And is in the process of commiting
12/03/05 04:07:19 INFO mapred.LocalJobRunner:
12/03/05 04:07:19 INFO mapred.TaskRunner: Task 'attempt_local_0001_m_000000_0' done.
12/03/05 04:07:19 INFO mapred.LocalJobRunner:
12/03/05 04:07:19 INFO mapred.Merger: Merging 1 sorted segments
12/03/05 04:07:19 INFO mapred.Merger: Down to the last merge-pass, with 1 segments left of total size: 5204 bytes
12/03/05 04:07:19 INFO mapred.LocalJobRunner:
12/03/05 04:07:19 INFO mapred.TaskRunner: Task:attempt_local_0001_r_000000_0 is done. And is in the process of commiting
12/03/05 04:07:19 INFO mapred.LocalJobRunner:
12/03/05 04:07:19 INFO mapred.TaskRunner: Task attempt_local_0001_r_000000_0 is allowed to commit now
12/03/05 04:07:19 INFO output.FileOutputCommitter: Saved output of task 'attempt_local_0001_r_000000_0' to output
12/03/05 04:07:19 INFO mapred.LocalJobRunner: reduce > reduce
12/03/05 04:07:19 INFO mapred.TaskRunner: Task 'attempt_local_0001_r_000000_0' done.
12/03/05 04:07:20 INFO mapred.JobClient:  map 100% reduce 100%
12/03/05 04:07:20 INFO mapred.JobClient: Job complete: job_local_0001
12/03/05 04:07:20 INFO mapred.JobClient: Counters: 12
12/03/05 04:07:20 INFO mapred.JobClient:   FileSystemCounters
12/03/05 04:07:20 INFO mapred.JobClient:     FILE_BYTES_READ=325874
12/03/05 04:07:20 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=356160
12/03/05 04:07:20 INFO mapred.JobClient:   Map-Reduce Framework
12/03/05 04:07:20 INFO mapred.JobClient:     Reduce input groups=383
12/03/05 04:07:20 INFO mapred.JobClient:     Combine output records=383
12/03/05 04:07:20 INFO mapred.JobClient:     Map input records=75
12/03/05 04:07:20 INFO mapred.JobClient:     Reduce shuffle bytes=0
12/03/05 04:07:20 INFO mapred.JobClient:     Reduce output records=383
12/03/05 04:07:20 INFO mapred.JobClient:     Spilled Records=766
12/03/05 04:07:20 INFO mapred.JobClient:     Map output bytes=6912
12/03/05 04:07:20 INFO mapred.JobClient:     Combine input records=663
12/03/05 04:07:20 INFO mapred.JobClient:     Map output records=663
12/03/05 04:07:20 INFO mapred.JobClient:     Reduce input records=383



OK!
Let's look at the result:
$ cat part-r-00000
"Glory  1
"Grandiose      1
"I      1
"Putin  1
"Putinism",     1
"These  1
"We     4
"every  1
"the    1
"unfair 1
"would  1
'victory'       2
(14:00  1
-       1
--------------------------------------------------------------------------------        1
17%.    1
18:00   1
2008    1
58.3%   1
6,000   1
60%     2
62.3%.  1
64%,    1
Alexey  1
Analysis        1
BBC     1
BBC:    1
Bridget 1
But     2
Continue        2
December's      1
December,       1
Diplomatic      1
Dmitry  1
ElectionRussia  1
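Note that wordcount keeps punctuation as part of each word, which is why entries like `"We` appear. To see the most frequent words first, the output can be sorted numerically on the count column; the sketch below uses a tiny stand-in file so it runs anywhere (replace sample.txt with the real part-r-00000 for actual data):

```shell
# Tiny stand-in for a wordcount output file: word, TAB, count per line.
printf '"We\t4\nBBC\t1\nBut\t2\nDmitry\t1\n' > sample.txt
# Sort numerically on the second column, highest count first.
sort -k2,2nr sample.txt | head -3
```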

The problem of Cognos configuration for an Oracle database connection


1. 20:04:41, 'LogService', 'StartService', 'Success'.
2. 20:04:47, 'ContentManager', 'getActiveContentManager', 'Failure'.
DPR-CMI-4006 Unable to determine the active Content Manager. Will retry periodically.
3. 20:04:47, 'com.cognos.pogo.contentmanager.coordinator.ActiveCMControl', 'pogo', 'Failure'.
DPR-DPR-1035 Dispatcher detected an error.

4. 20:04:46, CM-CFG-5063 A Content Manager configuration error was detected while connecting to the content store.
CM-CFG-5063 A Content Manager configuration error was detected while connecting to the content store.
CM-CFG-5137 Content Manager was unable to complete the initialization of the content store. For more information, review the log file. Before you restart Content Manager, you may need to recreate the content store database or clean it using dbClean_*.sql.
5. 20:05:10, 'ContentManagerService', 'StopService', 'Success'.
6. 20:05:10, 'ContentManagerService', 'StopService', 'Success'.
7. 20:05:10, 'CPS Producer Registration Service', 'StopService', 'Success'.
8. 20:05:10, 'CPS Producer Registration Service', 'StopService', 'Success'.
9. 20:05:10, 'MonitorService', 'StopService', 'Success'.
10. 20:05:10, 'MonitorService', 'StopService', 'Success'.
11. 20:05:10, 'DeliveryService', 'StopService', 'Success'.
12. 20:05:10, 'DeliveryService', 'StopService', 'Success'.
13. 20:05:11, 'EventService', 'StopService', 'Success'.
14. 20:05:11, 'EventService', 'StopService', 'Success'.
15. 20:05:11, 'JobService', 'StopService', 'Success'.
16. 20:05:11, 'JobService', 'StopService', 'Success'.
17. 20:05:11, 'com.cognos.pogo.services.DefaultHandlerService', 'pogo', 'Failure'.
DPR-DPR-1035 Dispatcher detected an error.

18. 20:05:11, 'com.cognos.pogo.services.DefaultHandlerService', 'pogo', 'Failure'.
DPR-DPR-1035 Dispatcher detected an error.

19. 20:05:11, 'SystemService', 'StopService', 'Success'.
20. 20:05:11, 'SystemService', 'StopService', 'Success'.
21. 20:05:11, 'MetricsManagerService', 'StopService', 'Success'.
22. 20:05:11, 'MetricsManagerService', 'StopService', 'Success'.
23. 20:05:11, 'BatchReportService', 'StopService', 'Success'.
24. 20:05:11, 'BatchReportService', 'StopService', 'Success'.
25. 20:05:11, 'DataIntegrationService', 'StopService', 'Success'.
26. 20:05:11, 'DataIntegrationService', 'StopService', 'Success'.
27. 20:05:11, 'ReportService', 'StopService', 'Success'.
28. 20:05:11, 'ReportService', 'StopService', 'Success'.
29. 20:05:11, 'LogService', 'StopService', 'Success'.
30. 20:05:11, 'LogService', 'StopService', 'Success'.
31. [ ERROR ] CFG-ERR-0103 Unable to start Cognos 8 service.
Execution of the external process returns an error code value of '-1'.
How to solve this problem:
1. Alter the Oracle database character set to UTF-8
2. Create a new user and grant it the resource, connect, and dba privileges
3. Restart the database
4. Delete the previous configuration in Content Manager and set up a new one
5. Done, it works now
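Step 2 can be sketched in SQL*Plus as below. The user name and password are placeholders of my own (not from this note), and granting dba is broader than Cognos strictly requires, so adjust to your site's security policy:

```shell
# Hypothetical example: create a content-store user for Cognos in Oracle.
# cognos_cm / change_me are placeholder credentials.
sqlplus / as sysdba <<'SQL'
CREATE USER cognos_cm IDENTIFIED BY change_me;
GRANT RESOURCE, CONNECT, DBA TO cognos_cm;
SQL
```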

Cognos COG-132 BI Metadata Model


IBM COG-132 Test Information

Test information:
Number of questions: 53
Time allowed in minutes: 60
Required passing score: 71%
Former exam code was BI0-132.
Test COG-132: IBM Cognos 8 BI Metadata Model Developer
The Cognos 8 BI Metadata Model Developer exam covers key concepts, technologies, and functionality of the Cognos products. In preparation for an exam, we recommend a combination of training and hands-on experience, and a detailed review of product documentation.
Common Data Structures and Traps (16%)
Identify Common Data Structures
Identify different data traps
Framework Manager Basics (16%)
Identify the different model types that can be published from Framework Manager
Define the purpose of Framework Manager
Create a project
Identify recommendations for preparing metadata
Predictable Results (16%)
Identify recommendations to achieve predictable results
Describe why and how to implement a time dimension
Identify techniques for creating intuitive business views
Security (9%)
Describe the Cognos 8 security environment
Describe how to implement security on models and packages
Project Management and Maintenance (5%)
Identify techniques for managing Framework Manager projects
Identify techniques for managing Framework Manager packages
Query Subject Types (9%)
Identify different query subject types and uses
Generated SQL and Complex Queries (18%)
Identify SQL Generation in Complex Queries
Identify aggregated result sets for multi-fact queries
Identify aggregations in Report Studio SQL
Optimization and Tuning (5%)
Describe the performance impact of queries
Advanced Techniques (7%)
Describe how to leverage user defined functions
Describe how to resolve a recursive relationship
Identify considerations for drill-through values and managing MUNs