CHANGES.txt
- HADOOP-3443. Avoid copying map output across partitions when renaming a
- single spill. (omalley via cdouglas)
- HADOOP-3454. Fix Text::find to search only valid byte ranges. (Chad Whipkey
- via cdouglas)
- HADOOP-3417. Removes the static configuration variable,
- commandLineConfig from JobClient. Moves the cli parsing from
- JobShell to GenericOptionsParser. Thus removes the class
- org.apache.hadoop.mapred.JobShell. (Amareshwari Sriramadasu via
- ddas)
- HADOOP-2132. Only RUNNING/PREP jobs can be killed. (Jothi Padmanabhan
- via ddas)
- HADOOP-3476. Code cleanup in fuse-dfs.
- (Peter Wyckoff via dhruba)
- HADOOP-2427. Ensure that the cwd of completed tasks is cleaned-up
- correctly on task-completion. (Amareshwari Sri Ramadasu via acmurthy)
- HADOOP-2565. Remove DFSPath cache of FileStatus.
- (Tsz Wo (Nicholas), SZE via hairong)
- HADOOP-3326. Clean up the local-fs and in-memory merge in the ReduceTask by
- spawning only one thread each for the on-disk and in-memory merge.
- (Sharad Agarwal via acmurthy)
- HADOOP-3493. Fix TestStreamingFailure to use FileUtil.fullyDelete to
- ensure correct cleanup. (Lohit Vijayarenu via acmurthy)
- HADOOP-3455. Fix NPE in ipc.Client in case of connection failure and
- improve its synchronization. (hairong)
- HADOOP-3240. Fix a testcase to not create files in the current directory.
- Instead the file is created in the test directory (Mahadev Konar via ddas)
- HADOOP-3496. Fix failure in TestHarFileSystem.testArchives due to change
- in HADOOP-3095. (tomwhite)
- HADOOP-3135. Get the system directory from the JobTracker instead of from
- the conf. (Subramaniam Krishnan via ddas)
- HADOOP-3503. Fix a race condition when client and namenode start
- simultaneous recovery of the same block. (dhruba & Tsz Wo
- (Nicholas), SZE)
- HADOOP-3440. Fixes DistributedCache to not create symlinks for paths which
- don't have fragments even when createSymLink is true.
- (Abhijit Bagri via ddas)
- HADOOP-3463. Hadoop-daemons script should cd to $HADOOP_HOME. (omalley)
- HADOOP-3489. Fix NPE in SafeModeMonitor. (Lohit Vijayarenu via shv)
- HADOOP-3509. Fix NPE in FSNamesystem.close. (Tsz Wo (Nicholas), SZE via
- shv)
- HADOOP-3491. Name-node shutdown causes InterruptedException in
- ResolutionMonitor. (Lohit Vijayarenu via shv)
- HADOOP-3511. Fixes namenode image to not set the root's quota to an
- invalid value when the quota was not saved in the image. (hairong)
- HADOOP-3516. Ensure the JobClient in HadoopArchives is initialized
- with a configuration. (Subramaniam Krishnan via omalley)
- HADOOP-3513. Improve NNThroughputBenchmark log messages. (shv)
- HADOOP-3519. Fix NPE in DFS FileSystem rename. (hairong via tomwhite)
-
- HADOOP-3528. Metrics FilesCreated and files_deleted metrics
- do not match. (Lohit via Mahadev)
- HADOOP-3418. When a directory is deleted, any leases that point to files
- in the subdirectory are removed. (Tsz Wo (Nicholas), SZE via dhruba)
- HADOOP-3542. Disables the creation of _logs directory for the archives
- directory. (Mahadev Konar via ddas)
- HADOOP-3544. Fixes a documentation issue for hadoop archives.
- (Mahadev Konar via ddas)
- HADOOP-3517. Fixes a problem in the reducer due to which the last InMemory
- merge may be missed. (Arun Murthy via ddas)
- HADOOP-3548. Fixes build.xml to copy all *.jar files to the dist.
- (Owen O'Malley via ddas)
- HADOOP-3363. Fix unformatted storage detection in FSImage. (shv)
- HADOOP-3560. Fixes a problem to do with split creation in archives.
- (Mahadev Konar via ddas)
- HADOOP-3545. Fixes an overflow problem in archives.
- (Mahadev Konar via ddas)
- HADOOP-3561. Prevent the trash from deleting its parent directories.
- (cdouglas)
- HADOOP-3575. Fix the clover ant target after package refactoring.
- (Nigel Daley via cdouglas)
- HADOOP-3539. Fix the tool path in the bin/hadoop script under
- cygwin. (Tsz Wo (Nicholas), Sze via omalley)
- HADOOP-3520. TestDFSUpgradeFromImage triggers a race condition in the
- Upgrade Manager. Fixed. (dhruba)
- HADOOP-3586. Provide deprecated, backwards compatible semantics for the
- combiner to be run once and only once on each record. (cdouglas)
- HADOOP-3533. Add deprecated methods to provide API compatibility
- between 0.18 and 0.17. Remove the deprecated methods in trunk. (omalley)
- HADOOP-3580. Fixes a problem to do with specifying a har as an input to
- a job. (Mahadev Konar via ddas)
- HADOOP-3333. Don't assign a task to a tasktracker that it failed to
- execute earlier (used to happen in the case of lost tasktrackers where
- the tasktracker would reinitialize and bind to a different port).
- (Jothi Padmanabhan and Arun Murthy via ddas)
- HADOOP-3534. Log IOExceptions that happen in closing the name
- system when the NameNode shuts down. (Tsz Wo (Nicholas) Sze via omalley)
- HADOOP-3546. TaskTracker re-initialization gets stuck in cleaning up.
- (Amareshwari Sriramadasu via ddas)
- HADOOP-3576. Fix NullPointerException when renaming a directory
- to its subdirectory. (Tsz Wo (Nicholas), SZE via hairong)
- HADOOP-3320. Fix NullPointerException in NetworkTopology.getDistance().
- (hairong)
- HADOOP-3569. KFS input stream read() now correctly reads 1 byte
- instead of 4. (Sriram Rao via omalley)
- HADOOP-3599. Fix JobConf::setCombineOnceOnly to modify the instance rather
- than a parameter. (Owen O'Malley via cdouglas)
- HADOOP-3590. Null pointer exception in JobTracker when the task tracker is
- not yet resolved. (Amar Ramesh Kamat via ddas)
- HADOOP-3603. Fix MapOutputCollector to spill when io.sort.spill.percent is
- 1.0 and to detect spills when emitted records write no data. (cdouglas)
- HADOOP-3615. Set DatanodeProtocol.versionID to the correct value.
- (Tsz Wo (Nicholas), SZE via cdouglas)
- HADOOP-3559. Fix the libhdfs test script and config to work with the
- current semantics. (lohit vijayarenu via cdouglas)
- HADOOP-3480. Need to update Eclipse template to reflect current trunk.
- (Brice Arnould via tomwhite)
-
- HADOOP-3588. Fixed usability issues with archives. (mahadev)
- HADOOP-3635. Uncaught exception in DataBlockScanner.
- (Tsz Wo (Nicholas), SZE via hairong)
- HADOOP-3639. Exception when closing DFSClient while multiple files are
- open. (Benjamin Gufler via hairong)
- HADOOP-3572. SetQuotas usage interface has some minor bugs. (hairong)
- HADOOP-3649. Fix bug in removing blocks from the corrupted block map.
- (Lohit Vijayarenu via shv)
- HADOOP-3604. Work around a JVM synchronization problem observed while
- retrieving the address of direct buffers from compression code by obtaining
- a lock during this call. (Arun C Murthy via cdouglas)
- HADOOP-3683. Fix dfs metrics to count file listings rather than files
- listed. (lohit vijayarenu via cdouglas)
- HADOOP-3597. Fix SortValidator to use filesystems other than the default as
- input. Validation job still runs on default fs.
- (Jothi Padmanabhan via cdouglas)
- HADOOP-3693. Fix archives, distcp and native library documentation to
- conform to style guidelines. (Amareshwari Sriramadasu via cdouglas)
- HADOOP-3653. Fix test-patch target to properly account for Eclipse
- classpath jars. (Brice Arnould via nigel)
- HADOOP-3692. Fix documentation for Cluster setup and Quick start guides.
- (Amareshwari Sriramadasu via ddas)
- HADOOP-3691. Fix streaming and tutorial docs. (Jothi Padmanabhan via ddas)
- HADOOP-3630. Fix NullPointerException in CompositeRecordReader from empty
- sources (cdouglas)
- HADOOP-3706. Fix a ClassLoader issue in the mapred.join Parser that
- prevents it from loading user-specified InputFormats.
- (Jingkei Ly via cdouglas)
- HADOOP-3718. Fix KFSOutputStream::write(int) to output a byte instead of
- an int, per the OutputStream contract. (Sriram Rao via cdouglas)
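The OutputStream contract referred to in HADOOP-3718 can be illustrated with the JDK alone. The sketch below is not the KFSOutputStream code; it uses java.io.ByteArrayOutputStream as a stand-in, and the class and method names are hypothetical. It shows that write(int) emits only the low-order 8 bits of its argument.

```java
import java.io.ByteArrayOutputStream;

// Illustrative only: a JDK stream stands in for KFSOutputStream.
final class WriteIntContractSketch {
    /** Passes one int to OutputStream.write(int) and returns the bytes produced. */
    static byte[] writeOne(int value) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(value); // contract: only the low-order 8 bits are written
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] written = writeOne(0x12345678);
        System.out.println(written.length);                         // prints 1
        System.out.println(Integer.toHexString(written[0] & 0xff)); // prints 78
    }
}
```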
- HADOOP-3647. Add debug logs to help track down a very occasional,
- hard-to-reproduce, bug in shuffle/merge on the reducer. (acmurthy)
- HADOOP-3716. Prevent listStatus in KosmosFileSystem from returning
- null for valid, empty directories. (Sriram Rao via cdouglas)
- HADOOP-3752. Fix audit logging to record rename events. (cdouglas)
- HADOOP-3737. Fix CompressedWritable to call Deflater::end to release
- compressor memory. (Grant Glouser via cdouglas)
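As a rough illustration of the pattern behind HADOOP-3737, the sketch below compresses a byte array with java.util.zip.Deflater and calls end() in a finally block so the compressor's memory is released even if compression fails. It is not the CompressedWritable code; the class and method names are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.util.zip.Deflater;

// Illustrative only: shows Deflater.end() being called once the compressor is done.
final class DeflaterEndSketch {
    static byte[] compress(byte[] input) {
        Deflater deflater = new Deflater();
        try {
            deflater.setInput(input);
            deflater.finish();
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[1024];
            while (!deflater.finished()) {
                int n = deflater.deflate(buffer);
                out.write(buffer, 0, n);
            }
            return out.toByteArray();
        } finally {
            // Release the compressor's memory promptly instead of waiting for GC.
            deflater.end();
        }
    }

    public static void main(String[] args) {
        byte[] compressed = compress(new byte[1000]);
        System.out.println("compressed 1000 bytes to " + compressed.length);
    }
}
```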
- HADOOP-3670. Fixes JobTracker to clear out split bytes when no longer
- required. (Amareshwari Sriramadasu via ddas)
- HADOOP-3755. Update gridmix to work with HOD 0.4 (Runping Qi via cdouglas)
-
- HADOOP-3743. Fix -libjars, -files, -archives options to work even if
- user code does not implement tools. (Amareshwari Sriramadasu via mahadev)
- HADOOP-3774. Fix typos in shell output. (Tsz Wo (Nicholas), SZE via
- cdouglas)
- HADOOP-3762. Fixed FileSystem cache to work with the default port. (cutting
- via omalley)
- HADOOP-3798. Fix tests compilation. (Mukund Madhugiri via omalley)
- HADOOP-3794. Return modification time instead of zero for KosmosFileSystem.
- (Sriram Rao via cdouglas)
- HADOOP-3806. Remove debug statement to stdout from QuickSort. (cdouglas)
- HADOOP-3776. Fix NPE at NameNode when datanode reports a block after it is
- deleted at NameNode. (rangadi)
- HADOOP-3537. Disallow adding a datanode to a network topology when its
- network location is not resolved. (hairong)
- HADOOP-3571. Fix bug in block removal used in lease recovery. (shv)
- HADOOP-3645. MetricsTimeVaryingRate returns wrong value for
- metric_avg_time. (Lohit Vijayarenu via hairong)
- HADOOP-3521. Reverted the missing cast to float for sending Counters' values
- to Hadoop metrics which was removed by HADOOP-544. (acmurthy)
- HADOOP-3820. Fixes two problems in the gridmix-env - a syntax error, and a
- wrong definition of USE_REAL_DATASET by default. (Arun Murthy via ddas)
- HADOOP-3724. Fixes two problems related to storing and recovering lease
- in the fsimage. (dhruba)
-
- HADOOP-3827. Fixed compression of empty map-outputs. (acmurthy)
- HADOOP-3865. Remove reference to FSNamesystem from metrics preventing
- garbage collection. (Lohit Vijayarenu via cdouglas)
- HADOOP-3884. Fix so that Eclipse plugin builds against recent
- Eclipse releases. (cutting)
- HADOOP-3837. Streaming jobs report progress status. (dhruba)
- HADOOP-3897. Fix a NPE in secondary namenode. (Lohit Vijayarenu via
- cdouglas)
- HADOOP-3901. Fix bin/hadoop to correctly set classpath under cygwin.
- (Tsz Wo (Nicholas) Sze via omalley)
- HADOOP-3947. Fix a problem in tasktracker reinitialization.
- (Amareshwari Sriramadasu via ddas)
- Release 0.17.3 - Unreleased
- IMPROVEMENTS
- HADOOP-4164. Chinese translation of the documentation. (Xuebing Yan via
- omalley)
- BUG FIXES
- HADOOP-4277. Checksum verification was mistakenly disabled for
- LocalFileSystem. (Raghu Angadi)
- HADOOP-4271. Checksum input stream can sometimes return invalid
- data to the user. (Ning Li via rangadi)
- HADOOP-4318. DistCp should use absolute paths for cleanup. (szetszwo)
- HADOOP-4326. ChecksumFileSystem does not override create(...) correctly.
- (szetszwo)
- Release 0.17.2 - 2008-08-11
- BUG FIXES
- HADOOP-3678. Avoid spurious exceptions logged at DataNode when clients
- read from DFS. (rangadi)
- HADOOP-3707. NameNode keeps a count of number of blocks scheduled
- to be written to a datanode and uses it to avoid allocating more
- blocks than a datanode can hold. (rangadi)
- HADOOP-3760. Fix a bug with HDFS file close() mistakenly introduced
- by HADOOP-3681. (Lohit Vijayarenu via rangadi)
- HADOOP-3681. DFSClient can get into an infinite loop while closing
- a file if there are some errors. (Lohit Vijayarenu via rangadi)
- HADOOP-3002. Hold off block removal while in safe mode. (shv)
- HADOOP-3685. Unbalanced replication target. (hairong)
- HADOOP-3758. Shutdown datanode on version mismatch instead of retrying
- continuously, preventing excessive logging at the namenode.
- (lohit vijayarenu via cdouglas)
- HADOOP-3633. Correct exception handling in DataXceiveServer, and throttle
- the number of xceiver threads in a data-node. (shv)
- HADOOP-3370. Ensure that the TaskTracker.runningJobs data-structure is
- correctly cleaned-up on task completion. (Zheng Shao via acmurthy)
- HADOOP-3813. Fix task-output clean-up on HDFS to use the recursive
- FileSystem.delete rather than the FileUtil.fullyDelete. (Amareshwari
- Sri Ramadasu via acmurthy)
- HADOOP-3859. Allow the maximum number of xceivers in the data node to
- be configurable. (Johan Oskarsson via omalley)
- HADOOP-3931. Fix corner case in the map-side sort that causes some values
- to be counted as too large and cause premature spills to disk. Some values
- will also bypass the combiner incorrectly. (cdouglas via omalley)
- Release 0.17.1 - 2008-06-23
- INCOMPATIBLE CHANGES
- HADOOP-3565. Fix the Java serialization, which is not enabled by
- default, to clear the state of the serializer between objects.
- (tomwhite via omalley)
- IMPROVEMENTS
- HADOOP-3522. Improve documentation on reduce pointing out that
- input keys and values will be reused. (omalley)
- HADOOP-3487. Balancer uses thread pools for managing its threads;
- therefore provides better resource management. (hairong)
- BUG FIXES
- HADOOP-2159. Namenode stuck in safemode. The counter blockSafe should
- not be decremented for invalid blocks. (hairong)
- HADOOP-3472. MapFile.Reader getClosest() function returns incorrect results
- when before is true (Todd Lipcon via Stack)
- HADOOP-3442. Limit recursion depth on the stack for QuickSort to prevent
- StackOverflowErrors. To avoid O(n*n) cases, when partitioning depth exceeds
- a multiple of log(n), change to HeapSort. (cdouglas)
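The approach described in HADOOP-3442 resembles what is commonly called introsort. The following is a minimal sketch of that idea, not Hadoop's QuickSort class: the class name, the choice of 2x as the multiple of log2(n), and the last-element pivot are all assumptions made for illustration.

```java
// Illustrative only: quicksort with a depth budget that falls back to heapsort.
final class DepthLimitedQuickSortSketch {
    static void sort(int[] a) {
        if (a.length < 2) return;
        // Depth budget: 2 * floor(log2(n)); the factor 2 is an assumption.
        int maxDepth = 2 * (31 - Integer.numberOfLeadingZeros(a.length));
        quickSort(a, 0, a.length - 1, maxDepth);
    }

    private static void quickSort(int[] a, int lo, int hi, int depth) {
        while (lo < hi) {
            if (depth-- == 0) {      // budget exhausted: switch algorithms
                heapSort(a, lo, hi);
                return;
            }
            int p = partition(a, lo, hi);
            // Recurse into the smaller side, iterate on the larger one.
            if (p - lo < hi - p) {
                quickSort(a, lo, p - 1, depth);
                lo = p + 1;
            } else {
                quickSort(a, p + 1, hi, depth);
                hi = p - 1;
            }
        }
    }

    // Lomuto partition with the last element as pivot.
    private static int partition(int[] a, int lo, int hi) {
        int pivot = a[hi];
        int i = lo;
        for (int j = lo; j < hi; j++) {
            if (a[j] < pivot) swap(a, i++, j);
        }
        swap(a, i, hi);
        return i;
    }

    // In-place heapsort of the subrange a[lo..hi].
    private static void heapSort(int[] a, int lo, int hi) {
        int n = hi - lo + 1;
        for (int i = n / 2 - 1; i >= 0; i--) siftDown(a, lo, i, n);
        for (int end = n - 1; end > 0; end--) {
            swap(a, lo, lo + end);
            siftDown(a, lo, 0, end);
        }
    }

    private static void siftDown(int[] a, int base, int root, int size) {
        while (true) {
            int child = 2 * root + 1;
            if (child >= size) return;
            if (child + 1 < size && a[base + child + 1] > a[base + child]) child++;
            if (a[base + root] >= a[base + child]) return;
            swap(a, base + root, base + child);
            root = child;
        }
    }

    private static void swap(int[] a, int i, int j) {
        int t = a[i]; a[i] = a[j]; a[j] = t;
    }

    public static void main(String[] args) {
        int[] data = {5, 3, 9, 1, 5, 0, -2};
        sort(data);
        System.out.println(java.util.Arrays.toString(data));
    }
}
```

Already-sorted input is the worst case for this pivot choice, so it is a convenient way to exercise the heapsort fallback.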
- HADOOP-3477. Fix build to not package contrib/*/bin twice in
- distributions. (Adam Heath via cutting)
- HADOOP-3475. Fix MapTask to correctly size the accounting allocation of
- io.sort.mb. (cdouglas)
- HADOOP-3550. Fix the serialization data structures in MapTask where the
- value lengths are incorrectly calculated. (cdouglas)
- HADOOP-3526. Fix contrib/data_join framework by cloning values retained
- in the reduce. (Spyros Blanas via cdouglas)
- HADOOP-1979. Speed up fsck by adding a buffered stream. (Lohit
- Vijaya Renu via omalley)
- Release 0.17.0 - 2008-05-18
- INCOMPATIBLE CHANGES
- HADOOP-2786. Move hbase out of hadoop core
- HADOOP-2345. New HDFS transactions to support appending
- to files. Disk layout version changed from -11 to -12. (dhruba)
- HADOOP-2192. Error messages from "dfs mv" command improved.
- (Mahadev Konar via dhruba)
- HADOOP-1902. "dfs du" command without any arguments operates on the
- current working directory. (Mahadev Konar via dhruba)
- HADOOP-2873. Fixed bad disk format introduced by HADOOP-2345.
- Disk layout version changed from -12 to -13. See changelist 630992
- (dhruba)
- HADOOP-1985. This addresses rack-awareness for Map tasks and for
- HDFS in a uniform way. (ddas)
- HADOOP-1986. Add support for a general serialization mechanism for
- Map Reduce. (tomwhite)
- HADOOP-771. FileSystem.delete() takes an explicit parameter that
- specifies whether a recursive delete is intended.
- (Mahadev Konar via dhruba)
- HADOOP-2470. Remove getContentLength(String), open(String, long, long)
- and isDir(String) from ClientProtocol. ClientProtocol version changed
- from 26 to 27. (Tsz Wo (Nicholas), SZE via cdouglas)
- HADOOP-2822. Remove deprecated code for classes InputFormatBase and
- PhasedFileSystem. (Amareshwari Sriramadasu via enis)
- HADOOP-2116. Changes the layout of the task execution directory.
- (Amareshwari Sriramadasu via ddas)
- HADOOP-2828. The following deprecated methods in Configuration.java
- have been removed
- getObject(String name)
- setObject(String name, Object value)
- get(String name, Object defaultValue)
- set(String name, Object value)
- Iterator entries()
- (Amareshwari Sriramadasu via ddas)
- HADOOP-2824. Removes one deprecated constructor from MiniMRCluster.
- (Amareshwari Sriramadasu via ddas)
- HADOOP-2823. Removes deprecated methods getColumn(), getLine() from
- org.apache.hadoop.record.compiler.generated.SimpleCharStream.
- (Amareshwari Sriramadasu via ddas)
- HADOOP-3060. Removes one unused constructor argument from MiniMRCluster.
- (Amareshwari Sriramadasu via ddas)
- HADOOP-2854. Remove deprecated o.a.h.ipc.Server::getUserInfo().
- (lohit vijayarenu via cdouglas)
- HADOOP-2563. Remove deprecated FileSystem::listPaths.
- (lohit vijayarenu via cdouglas)
- HADOOP-2818. Remove deprecated methods in Counters.
- (Amareshwari Sriramadasu via tomwhite)
- HADOOP-2831. Remove deprecated o.a.h.dfs.INode::getAbsoluteName()
- (lohit vijayarenu via cdouglas)
- HADOOP-2839. Remove deprecated FileSystem::globPaths.
- (lohit vijayarenu via cdouglas)
- HADOOP-2634. Deprecate ClientProtocol::exists.
- (lohit vijayarenu via cdouglas)
- HADOOP-2410. Make EC2 cluster nodes more independent of each other.
- Multiple concurrent EC2 clusters are now supported, and nodes may be
- added to a cluster on the fly with new nodes starting in the same EC2
- availability zone as the cluster. Ganglia monitoring and large
- instance sizes have also been added. (Chris K Wensel via tomwhite)
- HADOOP-2826. Deprecated FileSplit.getFile(), LineRecordReader.readLine().
- (Amareshwari Sriramadasu via ddas)
- HADOOP-3239. getFileInfo() returns null for non-existing files instead
- of throwing FileNotFoundException. (Lohit Vijayarenu via shv)
- HADOOP-3266. Removed HOD changes from CHANGES.txt, as they are now inside
- src/contrib/hod (Hemanth Yamijala via ddas)
- HADOOP-3280. Separate the configuration of the virtual memory size
- (mapred.child.ulimit) from the jvm heap size, so that 64 bit
- streaming applications are supported even when running with 32 bit
- jvms. (acmurthy via omalley)
- NEW FEATURES
- HADOOP-1398. Add HBase in-memory block cache. (tomwhite)
- HADOOP-2178. Job History on DFS. (Amareshwari Sri Ramadasu via ddas)
- HADOOP-2063. A new parameter to dfs -get command to fetch a file
- even if it is corrupted. (Tsz Wo (Nicholas), SZE via dhruba)
- HADOOP-2219. A new command "df -count" that counts the number of
- files and directories. (Tsz Wo (Nicholas), SZE via dhruba)
- HADOOP-2906. Add an OutputFormat capable of using keys, values, and
- config params to map records to different output files.
- (Runping Qi via cdouglas)
- HADOOP-2346. Utilities to support timeout while writing to sockets.
- DFSClient and DataNode sockets have 10min write timeout. (rangadi)
-
- HADOOP-2951. Add a contrib module that provides a utility to
- build or update Lucene indexes using Map/Reduce. (Ning Li via cutting)
- HADOOP-1622. Allow multiple jar files for map reduce.
- (Mahadev Konar via dhruba)
- HADOOP-2055. Allows users to set PathFilter on the FileInputFormat.
- (Alejandro Abdelnur via ddas)
- HADOOP-2551. More environment variables like HADOOP_NAMENODE_OPTS
- for better control of HADOOP_OPTS for each component. (rangadi)
- HADOOP-3001. Add job counters that measure the number of bytes
- read and written to HDFS, S3, KFS, and local file systems. (omalley)
- HADOOP-3048. A new Interface and a default implementation to convert
- and restore serializations of objects to/from strings. (enis)
- IMPROVEMENTS
- HADOOP-2655. Copy on write for data and metadata files in the
- presence of snapshots. Needed for supporting appends to HDFS
- files. (dhruba)
- HADOOP-1967. When a Path specifies the same scheme as the default
- FileSystem but no authority, the default FileSystem's authority is
- used. Also add warnings for old-format FileSystem names, accessor
- methods for fs.default.name, and check for null authority in HDFS.
- (cutting)
- HADOOP-2895. Let the profiling string be configurable.
- (Martin Traverso via cdouglas)
- HADOOP-910. Enables Reduces to do merges for the on-disk map output files
- in parallel with their copying. (Amar Kamat via ddas)
- HADOOP-730. Use rename rather than copy for local renames. (cdouglas)
- HADOOP-2810. Updated the Hadoop Core logo. (nigel)
- HADOOP-2057. Streaming should optionally treat a non-zero exit status
- of a child process as a failed task. (Rick Cox via tomwhite)
- HADOOP-2765. Enables specifying ulimits for streaming/pipes tasks (ddas)
- HADOOP-2888. Make gridmix scripts more readily configurable and amenable
- to automated execution. (Mukund Madhugiri via cdouglas)
- HADOOP-2908. A document that describes the DFS Shell command.
- (Mahadev Konar via dhruba)
- HADOOP-2981. Update README.txt to reflect the upcoming use of
- cryptography. (omalley)
- HADOOP-2804. Add support to publish CHANGES.txt as HTML when running
- the Ant 'docs' target. (nigel)
- HADOOP-2559. Change DFS block placement to allocate the first replica
- locally, the second off-rack, and the third intra-rack from the
- second. (lohit vijayarenu via cdouglas)
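The placement order described in HADOOP-2559 can be sketched as follows. This is a simplified model, not the namenode's placement code: the Node type, the rack names, and the rule of taking the first matching node (rather than a random or load-aware choice) are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: first replica local, second off-rack, third on the second's rack.
final class ReplicaPlacementSketch {
    /** Hypothetical node description: a name plus the rack it sits on. */
    static final class Node {
        final String name;
        final String rack;
        Node(String name, String rack) {
            this.name = name;
            this.rack = rack;
        }
    }

    static List<Node> chooseTargets(Node writer, List<Node> cluster) {
        List<Node> targets = new ArrayList<>();
        targets.add(writer); // first replica: the writer's own node

        Node second = null;
        for (Node n : cluster) {
            if (!n.rack.equals(writer.rack)) { // second replica: a different rack
                second = n;
                break;
            }
        }
        if (second == null) return targets; // single-rack fallback is not modeled here
        targets.add(second);

        for (Node n : cluster) {
            // Third replica: same rack as the second, but a different node.
            if (n != second && n.rack.equals(second.rack)) {
                targets.add(n);
                break;
            }
        }
        return targets;
    }

    public static void main(String[] args) {
        Node a1 = new Node("a1", "/rackA");
        Node a2 = new Node("a2", "/rackA");
        Node b1 = new Node("b1", "/rackB");
        Node b2 = new Node("b2", "/rackB");
        List<Node> cluster = new ArrayList<>();
        cluster.add(a1); cluster.add(a2); cluster.add(b1); cluster.add(b2);
        for (Node n : chooseTargets(a1, cluster)) {
            System.out.println(n.name + " on " + n.rack);
        }
    }
}
```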
- HADOOP-2939. Make the automated patch testing process an executable
- Ant target, test-patch. (nigel)
- HADOOP-2239. Add HsftpFileSystem to permit transferring files over ssl.
- (cdouglas)
- HADOOP-2886. Track individual RPC metrics.
- (girish vaitheeswaran via dhruba)
- HADOOP-2373. Improvement in safe-mode reporting. (shv)
- HADOOP-3091. Modify FsShell command -put to accept multiple sources.
- (Lohit Vijaya Renu via cdouglas)
- HADOOP-3092. Show counter values from job -status command.
- (Tom White via ddas)
- HADOOP-1228. Ant task to generate Eclipse project files. (tomwhite)
- HADOOP-3093. Adds Configuration.getStrings(name, default-value) and
- the corresponding setStrings. (Amareshwari Sriramadasu via ddas)
- HADOOP-3106. Adds documentation in forrest for debugging.
- (Amareshwari Sriramadasu via ddas)
- HADOOP-3099. Add an option to distcp to preserve user, group, and
- permission information. (Tsz Wo (Nicholas), SZE via cdouglas)
- HADOOP-2841. Unwrap AccessControlException and FileNotFoundException
- from RemoteException for DFSClient. (shv)
- HADOOP-3152. Make index interval configurable when using
- MapFileOutputFormat for map-reduce job. (Rong-En Fan via cutting)
- HADOOP-3143. Decrease number of slaves from 4 to 3 in TestMiniMRDFSSort,
- as Hudson generates false negatives under the current load.
- (Nigel Daley via cdouglas)
- HADOOP-3174. Illustrative example for MultipleFileInputFormat. (Enis
- Soztutar via acmurthy)
- HADOOP-2993. Clarify the usage of JAVA_HOME in the Quick Start guide.
- (acmurthy via nigel)
- HADOOP-3124. Make DataNode socket write timeout configurable. (rangadi)
- OPTIMIZATIONS
- HADOOP-2790. Fixed inefficient method hasSpeculativeTask by removing
- repetitive calls to get the current time and late checking to see if
- we want speculation on at all. (omalley)
- HADOOP-2758. Reduce buffer copies in DataNode when data is read from
- HDFS, without negatively affecting read throughput. (rangadi)
- HADOOP-2399. Input key and value to combiner and reducer is reused.
- (Owen O'Malley via ddas).
- HADOOP-2423. Code optimization in FSNamesystem.mkdirs.
- (Tsz Wo (Nicholas), SZE via dhruba)
- HADOOP-2606. ReplicationMonitor selects data-nodes to replicate directly
- from needed replication blocks instead of looking up for the blocks for
- each live data-node. (shv)
- HADOOP-2148. Eliminate redundant data-node blockMap lookups. (shv)
- HADOOP-2027. Return the number of bytes in each block in a file
- via a single rpc to the namenode to speed up job planning.
- (Lohit Vijaya Renu via omalley)
- HADOOP-2902. Replace uses of "fs.default.name" with calls to the
- accessor methods added in HADOOP-1967. (cutting)
- HADOOP-2119. Optimize scheduling of jobs with large numbers of
- tasks by replacing static arrays with lists of runnable tasks.
- (Amar Kamat via omalley)
- HADOOP-2919. Reduce the number of memory copies done during the
- map output sorting. Also adds two config variables:
- io.sort.spill.percent - the percentage of io.sort.mb that should
- cause a spill (default 80%)
- io.sort.record.percent - the percent of io.sort.mb that should
- hold key/value indexes (default 5%)
- (cdouglas via omalley)
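To make the two defaults in HADOOP-2919 concrete, the arithmetic below reads the entry literally: both settings are treated as percentages of io.sort.mb. The value io.sort.mb = 100 is an assumed example, the class and method names are hypothetical, and whole-number percentages are used only to keep the arithmetic exact; how MapTask actually applies these settings internally is not modeled.

```java
// Illustrative only: a literal reading of the two percentages against io.sort.mb.
final class SortBufferSketch {
    /** Bytes reserved for key/value indexes (io.sort.record.percent). */
    static long indexBytes(int ioSortMb, int recordPercent) {
        return (long) ioSortMb * 1024 * 1024 * recordPercent / 100;
    }

    /** Bytes of buffer use that should cause a spill (io.sort.spill.percent). */
    static long spillThresholdBytes(int ioSortMb, int spillPercent) {
        return (long) ioSortMb * 1024 * 1024 * spillPercent / 100;
    }

    public static void main(String[] args) {
        int ioSortMb = 100; // assumed example value
        System.out.println(indexBytes(ioSortMb, 5));           // prints 5242880
        System.out.println(spillThresholdBytes(ioSortMb, 80)); // prints 83886080
    }
}
```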
- HADOOP-3140. Doesn't add a task in the commit queue if the task hadn't
- generated any output. (Amar Kamat via ddas)
- HADOOP-3168. Reduce the amount of logging in streaming to an
- exponentially increasing number of records (up to 10,000
- records/log). (Zheng Shao via omalley)
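One way to picture the logging schedule in HADOOP-3168 is the sketch below. It is not the streaming code: the growth factor of 10 and the class and method names are assumptions. The interval between log lines grows until it reaches 10,000 records and then stays there.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: log at record 1, 10, 100, ... then every 10,000 records.
final class ExponentialLogSketch {
    static final long MAX_INTERVAL = 10_000;

    /** Returns the record counts at which a log line would be emitted. */
    static List<Long> loggedAt(long totalRecords) {
        List<Long> logged = new ArrayList<>();
        long nextLog = 1;
        for (long rec = 1; rec <= totalRecords; rec++) {
            if (rec == nextLog) {
                logged.add(rec);
                // Grow tenfold (assumed factor) until the cap, then step by the cap.
                nextLog = (nextLog < MAX_INTERVAL) ? nextLog * 10 : nextLog + MAX_INTERVAL;
            }
        }
        return logged;
    }

    public static void main(String[] args) {
        System.out.println(loggedAt(25_000)); // prints [1, 10, 100, 1000, 10000, 20000]
    }
}
```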
-
- BUG FIXES
- HADOOP-2195. '-mkdir' behaviour is now closer to Linux shell in case of
- errors. (Mahadev Konar via rangadi)
-
- HADOOP-2190. Bring the behaviour of '-ls' and '-du' closer to Linux shell
- commands in case of errors. (Mahadev Konar via rangadi)
-
- HADOOP-2193. 'fs -rm' and 'fs -rmr' show error message when the target
- file does not exist. (Mahadev Konar via rangadi)
-
- HADOOP-2738. Text is not subclassable because set(Text) and compareTo(Object)
- access the other instance's private members directly. (jimk)
- HADOOP-2779. Remove the references to HBase in the build.xml. (omalley)
- HADOOP-2194. dfs cat on a non-existent file throws FileNotFoundException.
- (Mahadev Konar via dhruba)
- HADOOP-2767. Fix for NetworkTopology erroneously skipping the last leaf
- node on a rack. (Hairong Kuang and Mark Butler via dhruba)
- HADOOP-1593. FsShell works with paths in non-default FileSystem.
- (Mahadev Konar via dhruba)
- HADOOP-2191. du and dus command on non-existent directory gives
- appropriate error message. (Mahadev Konar via dhruba)
- HADOOP-2832. Remove tabs from code of DFSClient for better
- indentation. (dhruba)
- HADOOP-2844. distcp closes file handles for sequence files.
- (Tsz Wo (Nicholas), SZE via dhruba)
- HADOOP-2727. Fix links in Web UI of the hadoop daemons and some docs
- (Amareshwari Sri Ramadasu via ddas)
- HADOOP-2871. Fixes a problem to do with file: URI in the JobHistory init.
- (Amareshwari Sri Ramadasu via ddas)
- HADOOP-2800. Deprecate SetFile.Writer constructor not the whole class.
- (Johan Oskarsson via tomwhite)
- HADOOP-2891. DFSClient.close() closes all open files. (dhruba)
- HADOOP-2845. Fix dfsadmin disk utilization report on Solaris.
- (Martin Traverso via tomwhite)
- HADOOP-2912. MiniDFSCluster restart should wait for namenode to exit
- safemode. This was causing TestFsck to fail. (Mahadev Konar via dhruba)
- HADOOP-2820. The following classes in streaming are removed :
- StreamLineRecordReader StreamOutputFormat StreamSequenceRecordReader.
- (Amareshwari Sri Ramadasu via ddas)
- HADOOP-2819. The following methods in JobConf are removed:
- getInputKeyClass() setInputKeyClass getInputValueClass()
- setInputValueClass(Class theClass) setSpeculativeExecution
- getSpeculativeExecution() (Amareshwari Sri Ramadasu via ddas)
- HADOOP-2817. Removes deprecated mapred.tasktracker.tasks.maximum and
- ClusterStatus.getMaxTasks(). (Amareshwari Sri Ramadasu via ddas)
- HADOOP-2821. Removes deprecated ShellUtil and ToolBase classes from
- the util package. (Amareshwari Sri Ramadasu via ddas)
- HADOOP-2934. The namenode was encountering a NPE while loading
- leases from the fsimage. Fixed. (dhruba)
- HADOOP-2938. Some fs commands did not glob paths.
- (Tsz Wo (Nicholas), SZE via rangadi)
- HADOOP-2943. Compression of intermediate map output causes failures
- in the merge. (cdouglas)
- HADOOP-2870. DataNode and NameNode close all connections while
- shutting down. (Hairong Kuang via dhruba)
- HADOOP-2973. Fix TestLocalDFS for Windows platform.
- (Tsz Wo (Nicholas), SZE via dhruba)
- HADOOP-2971. select multiple times if it returns early in
- SocketIOWithTimeout. (rangadi)
- HADOOP-2955. Fix TestCrcCorruption test failures caused by HADOOP-2758
- (rangadi)
- HADOOP-2657. A flush call on the DFSOutputStream flushes the last
- partial CRC chunk too. (dhruba)
- HADOOP-2974. IPC unit tests used "0.0.0.0" to connect to server, which
- is not always supported. (rangadi)
- HADOOP-2996. Fixes uses of StringBuffer in StreamUtils class.
- (Dave Brosius via ddas)
- HADOOP-2995. Fixes StreamBaseRecordReader's getProgress to return a
- floating point number. (Dave Brosius via ddas)
- HADOOP-2972. Fix for a NPE in FSDataset.invalidate.
- (Mahadev Konar via dhruba)
- HADOOP-2994. Code cleanup for DFSClient: remove redundant
- conversions from string to string. (Dave Brosius via dhruba)
- HADOOP-3009. TestFileCreation sometimes fails because restarting
- minidfscluster sometimes creates datanodes with ports that are
- different from their original instance. (dhruba)
- HADOOP-2992. Distributed Upgrade framework works correctly with
- more than one upgrade object. (Konstantin Shvachko via dhruba)
- HADOOP-2679. Fix a typo in libhdfs. (Jason via dhruba)
- HADOOP-2976. When a lease expires, the Namenode ensures that
- blocks of the file are adequately replicated. (dhruba)
- HADOOP-2901. Fixes the creation of info servers in the JobClient
- and JobTracker. Removes the creation from JobClient and removes
- additional info server from the JobTracker. Also adds the command
- line utility to view the history files (HADOOP-2896), and fixes
- bugs in JSPs to do with analysis - HADOOP-2742, HADOOP-2792.
- (Amareshwari Sri Ramadasu via ddas)
- HADOOP-2890. If different datanodes report the same block but
- with different sizes to the namenode, the namenode picks the
- replica(s) with the largest size as the only valid replica(s). (dhruba)
- HADOOP-2825. Deprecated MapOutputLocation.getFile() is removed.
- (Amareshwari Sri Ramadasu via ddas)
- HADOOP-2806. Fixes a streaming document.
- (Amareshwari Sriramadasu via ddas)
- HADOOP-3008. SocketIOWithTimeout throws InterruptedIOException if the
- thread is interrupted while it is waiting. (rangadi)
-
- HADOOP-3006. Fix wrong packet size reported by DataNode when a block
- is being replicated. (rangadi)
- HADOOP-3029. Datanode prints log message "firstbadlink" only if
- it detects a bad connection to another datanode in the pipeline. (dhruba)
- HADOOP-3030. Release reserved space for file in InMemoryFileSystem if
- checksum reservation fails. (Devaraj Das via cdouglas)
- HADOOP-3036. Fix findbugs warnings in UpgradeUtilities. (Konstantin
- Shvachko via cdouglas)
- HADOOP-3025. ChecksumFileSystem supports the delete method with
- the recursive flag. (Mahadev Konar via dhruba)
- HADOOP-3012. dfs -mv file to user home directory throws exception if
- the user home directory does not exist. (Mahadev Konar via dhruba)
-
- HADOOP-3066. Should not require superuser privilege to query if hdfs is in
- safe mode (jimk)
- HADOOP-3040. If the input line starts with the separator char, the key
- is set as empty. (Amareshwari Sriramadasu via ddas)
- HADOOP-3080. Removes flush calls from JobHistory.
- (Amareshwari Sriramadasu via ddas)
- HADOOP-3086. Adds the testcase missed during commit of HADOOP-3040.
- (Amareshwari Sriramadasu via ddas)
- HADOOP-3046. Fix the raw comparators for Text and BytesWritables
- to use the provided length rather than recompute it. (omalley)
- HADOOP-3094. Fix BytesWritable.toString to avoid extending the sign bit
- (Owen O'Malley via cdouglas)
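The sign-extension problem named in HADOOP-3094 is easy to reproduce in plain Java. The sketch below is not the BytesWritable code and its names are hypothetical. It shows how widening a negative byte to int drags the sign bit along, and how masking with 0xff avoids that.

```java
// Illustrative only: hex-formatting a byte with and without sign extension.
final class SignExtensionSketch {
    /** Widens the byte directly, so negative values are sign-extended to 32 bits. */
    static String hexNaive(byte b) {
        return Integer.toHexString(b);
    }

    /** Masks to the low 8 bits first, so the result is at most two hex digits. */
    static String hexMasked(byte b) {
        return Integer.toHexString(b & 0xff);
    }

    public static void main(String[] args) {
        byte b = (byte) 0xf0;
        System.out.println(hexNaive(b));  // prints fffffff0
        System.out.println(hexMasked(b)); // prints f0
    }
}
```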
- HADOOP-3067. DFSInputStream's position read does not close the sockets.
- (rangadi)
- HADOOP-3073. close() on SocketInputStream or SocketOutputStream should
- close the underlying channel. (rangadi)
- HADOOP-3087. Fixes a problem to do with refreshing of loadHistory.jsp.
- (Amareshwari Sriramadasu via ddas)
- HADOOP-3065. Better logging message if the rack location of a datanode
- cannot be determined. (Devaraj Das via dhruba)
- HADOOP-3064. Commas in a file path should not be treated as delimiters.
- (Hairong Kuang via shv)
- HADOOP-2997. Adds test for non-writable serializer. Also fixes a problem
- introduced by HADOOP-2399. (Tom White via ddas)
- HADOOP-3114. Fix TestDFSShell on Windows. (Lohit Vijaya Renu via cdouglas)
- HADOOP-3118. Fix Namenode NPE while loading fsimage after a cluster
- upgrade from older disk format. (dhruba)
- HADOOP-3161. Fix FileUtil.HardLink.getLinkCount on Mac OS. (nigel
- via omalley)
- HADOOP-2927. Fix TestDU to accurately calculate the expected file size.
- (shv via nigel)
- HADOOP-3123. Fix the native library build scripts to work on Solaris.
- (tomwhite via omalley)
- HADOOP-3089. Streaming should accept stderr from task before
- first key arrives. (Rick Cox via tomwhite)
- HADOOP-3146. A DFSOutputStream.flush method is renamed as
- DFSOutputStream.fsync. (dhruba)
- HADOOP-3165. -put/-copyFromLocal did not treat input file "-" as stdin.
- (Lohit Vijayarenu via rangadi)
- HADOOP-3041. Deprecate JobConf.setOutputPath and JobConf.getOutputPath.
- Deprecate OutputFormatBase. Add FileOutputFormat. Existing output formats
- extending OutputFormatBase, now extend FileOutputFormat. Add the following
- APIs in FileOutputFormat: setOutputPath, getOutputPath, getWorkOutputPath.
- (Amareshwari Sriramadasu via nigel)
- HADOOP-3083. The fsimage does not store leases. This would have to be
- reworked in the next release to support appends. (dhruba)
- HADOOP-3166. Fix an ArrayIndexOutOfBoundsException in the spill thread
- and make exception handling more promiscuous to catch this condition.
- (cdouglas)
- HADOOP-3050. DataNode sends one and only one block report after
- it registers with the namenode. (Hairong Kuang)
- HADOOP-3044. NNBench sets the right configuration for the mapper.
- (Hairong Kuang)
- HADOOP-3178. Fix GridMix scripts for small and medium jobs
- to handle input paths differently. (Mukund Madhugiri via nigel)
- HADOOP-1911. Fix an infinite loop in DFSClient when all replicas of a
- block are bad (cdouglas)
- HADOOP-3157. Fix path handling in DistributedCache and TestMiniMRLocalFS.
- (Doug Cutting via rangadi)
- HADOOP-3018. Fix the eclipse plug-in contrib wrt removed deprecated
- methods (taton)
- HADOOP-3183. Fix TestJobShell to use 'ls' instead of java.io.File::exists
- since cygwin symlinks are unsupported.
- (Mahadev Konar via cdouglas)
- HADOOP-3175. Fix FsShell.CommandFormat to handle "-" in arguments.
- (Edward J. Yoon via rangadi)
- HADOOP-3220. Safemode message corrected. (shv)
- HADOOP-3208. Fix WritableDeserializer to set the Configuration on
- deserialized Writables. (Enis Soztutar via cdouglas)
- HADOOP-3224. 'dfs -du /dir' does not return correct size.
- (Lohit Vijayarenu via rangadi)
- HADOOP-3223. Fix typo in help message for -chmod. (rangadi)
- HADOOP-1373. checkPath() should ignore case when it compares authority.
- (Edward J. Yoon via rangadi)
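A case-insensitive authority comparison of the kind HADOOP-1373 describes can be sketched with `java.net.URI` alone. The class `AuthorityCheckSketch`, its method name, and the example host names are hypothetical; the real `checkPath()` logic is not reproduced here.

```java
import java.net.URI;

/** Hypothetical helper, not part of Hadoop: compares URI authorities ignoring case. */
final class AuthorityCheckSketch {

    static boolean sameAuthority(URI a, URI b) {
        String x = a.getAuthority();
        String y = b.getAuthority();
        if (x == null || y == null) {
            // Assumption of this sketch: two missing authorities count as equal.
            return x == null && y == null;
        }
        // Host names are case-insensitive, so compare without regard to case.
        return x.equalsIgnoreCase(y);
    }

    public static void main(String[] args) {
        URI fs = URI.create("hdfs://NameNode.example.com:9000/");
        URI path = URI.create("hdfs://namenode.example.com:9000/user/data");
        System.out.println(sameAuthority(fs, path)); // true
    }
}
```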
- HADOOP-3204. Fixes a problem to do with ReduceTask's LocalFSMerger not
- catching Throwable. (Amar Ramesh Kamat via ddas)
- HADOOP-3229. Report progress when collecting records from the mapper and
- the combiner. (Doug Cutting via cdouglas)
- HADOOP-3225. Unwrapping methods of RemoteException should initialize
- detailedMassage field. (Mahadev Konar, shv, cdouglas)
- HADOOP-3247. Fix gridmix scripts to use the correct globbing syntax and
- change maxentToSameCluster to run the correct number of jobs.
- (Runping Qi via cdouglas)
- HADOOP-3242. Fix the RecordReader of SequenceFileAsBinaryInputFormat to
- correctly read from the start of the split and not the beginning of the
- file. (cdouglas via acmurthy)
- HADOOP-3256. Encodes the job name used in the filename for history files.
- (Arun Murthy via ddas)
- HADOOP-3162. Ensure that comma-separated input paths are treated correctly
- as multiple input paths. (Amareshwari Sri Ramadasu via acmurthy)
- HADOOP-3263. Ensure that the job-history log file always follows the
- pattern of hostname_timestamp_jobid_username_jobname even if username
- and/or jobname are not specified. This helps to avoid wrong assumptions
- made about the job-history log filename in jobhistory.jsp. (acmurthy)
- HADOOP-3251. Fixes getFilesystemName in JobTracker and LocalJobRunner to
- use FileSystem.getUri instead of FileSystem.getName. (Arun Murthy via ddas)
- HADOOP-3237. Fixes TestDFSShell.testErrOutPut on Windows platform.
- (Mahadev Konar via ddas)
- HADOOP-3279. TaskTracker checks for SUCCEEDED task status in addition to
- COMMIT_PENDING status when it fails maps due to lost map.
- (Devaraj Das)
- HADOOP-3286. Prevent collisions in gridmix output dirs by increasing the
- granularity of the timestamp. (Runping Qi via cdouglas)
- HADOOP-3285. Fix input split locality when the splits align to
- fs blocks. (omalley)
- HADOOP-3372. Fix heap management in streaming tests. (Arun Murthy via
- cdouglas)
- HADOOP-3031. Fix javac warnings in test classes. (cdouglas)
- HADOOP-3382. Fix memory leak when files are not cleanly closed (rangadi)
- HADOOP-3322. Fix to push MetricsRecord for rpc metrics. (Eric Yang via
- mukund)
- Release 0.16.4 - 2008-05-05
- BUG FIXES
- HADOOP-3138. DFS mkdirs() should not throw an exception if the directory
- already exists. (rangadi via mukund)
- HADOOP-3294. Fix distcp to check the destination length and retry the copy
- if it doesn't match the src length. (Tsz Wo (Nicholas), SZE via mukund)
- HADOOP-3186. Fix incorrect permission checking for mv and renameTo
- in HDFS. (Tsz Wo (Nicholas), SZE via mukund)
- Release 0.16.3 - 2008-04-16
- BUG FIXES
- HADOOP-3010. Fix ConcurrentModificationException in ipc.Server.Responder.
- (rangadi)
- HADOOP-3154. Catch all Throwables from the SpillThread in MapTask, rather
- than IOExceptions only. (ddas via cdouglas)
- HADOOP-3159. Avoid file system cache being overwritten whenever
- configuration is modified. (Tsz Wo (Nicholas), SZE via hairong)
- HADOOP-3139. Remove the consistency check for the FileSystem cache in
- closeAll() that causes spurious warnings and a deadlock.
- (Tsz Wo (Nicholas), SZE via cdouglas)
- HADOOP-3195. Fix TestFileSystem to be deterministic.
- (Tsz Wo (Nicholas), SZE via cdouglas)
- HADOOP-3069. Primary name-node should not truncate image when transferring
- it from the secondary. (shv)
- HADOOP-3182. Change permissions of the job-submission directory to 777
- from 733 to ensure sharing of HOD clusters works correctly. (Tsz Wo
- (Nicholas), Sze and Amareshwari Sri Ramadasu via acmurthy)
- Release 0.16.2 - 2008-04-02
- BUG FIXES
- HADOOP-3011. Prohibit distcp from overwriting directories on the
- destination filesystem with files. (cdouglas)
- HADOOP-3033. The BlockReceiver thread in the datanode writes data to
- the block file, changes file position (if needed) and flushes all by
- itself. The PacketResponder thread does not flush block file. (dhruba)
- HADOOP-2978. Fixes the JobHistory log format for counters.
- (Runping Qi via ddas)
- HADOOP-2985. Fixes LocalJobRunner to tolerate null job output path.
- Also makes the _temporary a constant in MRConstants.java.
- (Amareshwari Sriramadasu via ddas)
- HADOOP-3003. FileSystem cache key is updated after a
- FileSystem object is created. (Tsz Wo (Nicholas), SZE via dhruba)
- HADOOP-3042. Updates the Javadoc in JobConf.getOutputPath to reflect
- the actual temporary path. (Amareshwari Sriramadasu via ddas)
- HADOOP-3007. Tolerate mirror failures while DataNode is replicating
- blocks as it used to before. (rangadi)
- HADOOP-2944. Fixes a "Run on Hadoop" wizard NPE when creating a
- Location from the wizard. (taton)
- HADOOP-3049. Fixes a problem in MultiThreadedMapRunner to do with
- catching RuntimeExceptions. (Alejandro Abdelnur via ddas)
- HADOOP-3039. Fixes a problem to do with exceptions in tasks not
- killing jobs. (Amareshwari Sriramadasu via ddas)
- HADOOP-3027. Fixes a problem to do with adding a shutdown hook in
- FileSystem. (Amareshwari Sriramadasu via ddas)
- HADOOP-3056. Fix distcp when the target is an empty directory by
- making sure the directory is created first. (cdouglas and acmurthy
- via omalley)
- HADOOP-3070. Protect the trash emptier thread from null pointer
- exceptions. (Koji Noguchi via omalley)
- HADOOP-3084. Fix HftpFileSystem to work for zero-length files.
- (cdouglas)
- HADOOP-3107. Fix NPE when fsck invokes getListings. (dhruba)
- HADOOP-3104. Limit MultithreadedMapRunner to have a fixed length queue
- between the RecordReader and the map threads. (Alejandro Abdelnur via
- omalley)
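The idea of a fixed-length queue between a producer and worker threads, as in HADOOP-3104, can be sketched with `java.util.concurrent.ArrayBlockingQueue`. This is only an illustration under stated assumptions: the class `BoundedQueueSketch`, the single consumer thread, and the end-of-input marker are all hypothetical and do not mirror the MultithreadedMapRunner code.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical sketch, not part of Hadoop: a bounded hand-off queue. */
final class BoundedQueueSketch {

    private static final String END = "<end-of-input>"; // marker telling the consumer to stop

    /** Feeds `records` items through a queue of fixed `capacity`; returns how many were consumed. */
    static int run(int records, int capacity) {
        BlockingQueue<String> queue = new ArrayBlockingQueue<>(capacity);
        AtomicInteger processed = new AtomicInteger();

        Thread consumer = new Thread(() -> {
            try {
                while (true) {
                    String rec = queue.take(); // blocks while the queue is empty
                    if (END.equals(rec)) {
                        break;
                    }
                    processed.incrementAndGet();
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        consumer.start();

        try {
            for (int i = 0; i < records; i++) {
                queue.put("record-" + i); // blocks while the queue is full, bounding memory use
            }
            queue.put(END);
            consumer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while feeding the queue", e);
        }
        return processed.get();
    }

    public static void main(String[] args) {
        System.out.println("processed " + run(100, 4) + " records");
    }
}
```

Because `put` blocks when the queue is full, a fast reader cannot buffer an unbounded number of records ahead of slow workers.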
- HADOOP-2833. Do not use "Dr. Who" as the default user in JobClient.
- A valid user name is required. (Tsz Wo (Nicholas), SZE via rangadi)
- HADOOP-3128. Throw RemoteException in setPermissions and setOwner of
- DistributedFileSystem. (shv via nigel)
- Release 0.16.1 - 2008-03-13
- INCOMPATIBLE CHANGES
- HADOOP-2869. Deprecate SequenceFile.setCompressionType in favor of
- SequenceFile.createWriter, SequenceFileOutputFormat.setCompressionType,
- and JobConf.setMapOutputCompressionType. (Arun C Murthy via cdouglas)
- Configuration changes to hadoop-default.xml:
- deprecated io.seqfile.compression.type
- IMPROVEMENTS
- HADOOP-2371. User guide for file permissions in HDFS.
- (Robert Chansler via rangadi)
- HADOOP-3098. Allow more characters in user and group names while
- using -chown and -chgrp commands. (rangadi)
-
- BUG FIXES
- HADOOP-2789. Race condition in IPC Server Responder that could close
- connections early. (Raghu Angadi)
-
- HADOOP-2785. minor. Fix a typo in Datanode block verification
- (Raghu Angadi)
-
- HADOOP-2788. minor. Fix help message for chgrp shell command (Raghu Angadi).
-
- HADOOP-1188. fstime file is updated when a storage directory containing
- namespace image becomes inaccessible. (shv)
- HADOOP-2787. An application can set a configuration variable named
- dfs.umask to set the umask that is used by DFS.
- (Tsz Wo (Nicholas), SZE via dhruba)
- HADOOP-2780. The default socket buffer size for DataNodes is 128K.
- (dhruba)
- HADOOP-2716. Superuser privileges for the Balancer.
- (Tsz Wo (Nicholas), SZE via shv)
- HADOOP-2754. Filter out .crc files from local file system listing.
- (Hairong Kuang via shv)
- HADOOP-2733. Fix compiler warnings in test code.
- (Tsz Wo (Nicholas), SZE via cdouglas)
- HADOOP-2725. Modify distcp to avoid leaving partially copied files at
- the destination after encountering an error. (Tsz Wo (Nicholas), SZE
- via cdouglas)
- HADOOP-2391. Cleanup job output directory before declaring a job as
- SUCCESSFUL. (Amareshwari Sri Ramadasu via ddas)
- HADOOP-2808. Minor fix to FileUtil::copy to mind the overwrite
- formal. (cdouglas)
- HADOOP-2683. Moving UGI out of the RPC Server.
- (Tsz Wo (Nicholas), SZE via shv)
- HADOOP-2814. Fix for NPE in datanode in unit test TestDataTransferProtocol.
- (Raghu Angadi via dhruba)
- HADOOP-2811. Dump of counters in job history does not add comma between
- groups. (runping via omalley)
- HADOOP-2735. Enables setting TMPDIR for tasks.
- (Amareshwari Sri Ramadasu via ddas)
- HADOOP-2843. Fix protections on map-side join classes to enable derivation.
- (cdouglas via omalley)
- HADOOP-2840. Fix gridmix scripts to correctly invoke the java sort through
- the proper jar. (Mukund Madhugiri via cdouglas)
- HADOOP-2769. TestNNThroughputBenchmark should not use a fixed port for
- the namenode http port. (omalley)
- HADOOP-2852. Update gridmix benchmark to avoid an artificially long tail.
- (cdouglas)
- HADOOP-2894. Fix a problem to do with tasktrackers failing to connect to
- JobTracker upon reinitialization. (Owen O'Malley via ddas).
- HADOOP-2903. Fix exception generated by Metrics while using pushMetric().
- (girish vaitheeswaran via dhruba)
- HADOOP-2904. Fix to RPC metrics to log the correct host name.
- (girish vaitheeswaran via dhruba)
- HADOOP-2918. Improve error logging so that dfs write failures with
- "No lease on file" can be diagnosed. (dhruba)
- HADOOP-2923. Add SequenceFileAsBinaryInputFormat, which was
- missed in the commit for HADOOP-2603. (cdouglas via omalley)
- HADOOP-2931. IOException thrown by DFSOutputStream had wrong stack
- trace in some cases. (Michael Bieniosek via rangadi)
- HADOOP-2883. Write failures and data corruptions on HDFS files.
- The write timeout is back to what it was on 0.15 release. Also, the
- datanode flushes the block file buffered output stream before
- sending a positive ack for the packet back to the client. (dhruba)
- HADOOP-2756. NPE in DFSClient while closing DFSOutputStreams
- under load. (rangadi)
- HADOOP-2958. Fixed FileBench which broke due to HADOOP-2391 which performs
- a check for existence of the output directory and a trivial bug in
- GenericMRLoadGenerator where min/max word lengths were identical since
- they were looking at the same config variables (Chris Douglas via
- acmurthy)
- HADOOP-2915. Fixed FileSystem.CACHE so that a username is included
- in the cache key. (Tsz Wo (Nicholas), SZE via nigel)
- HADOOP-2813. TestDU unit test uses its own directory to run its
- sequence of tests. (Mahadev Konar via dhruba)
- Release 0.16.0 - 2008-02-07
- INCOMPATIBLE CHANGES
- HADOOP-1245. Use the mapred.tasktracker.tasks.maximum value
- configured on each tasktracker when allocating tasks, instead of
- the value configured on the jobtracker. InterTrackerProtocol
- version changed from 5 to 6. (Michael Bieniosek via omalley)
- HADOOP-1843. Removed code from Configuration and JobConf deprecated by
- HADOOP-785 and a minor fix to Configuration.toString. Specifically the
- important change is that mapred-default.xml is no longer supported and
- Configuration no longer supports the notion of default/final resources.
- (acmurthy)
- HADOOP-1302. Remove deprecated abacus code from the contrib directory.
- This also fixes a configuration bug in AggregateWordCount, so that the
- job now works. (enis)
- HADOOP-2288. Enhance FileSystem API to support access control.
- (Tsz Wo (Nicholas), SZE via dhruba)
- HADOOP-2184. RPC Support for user permissions and authentication.
- (Raghu Angadi via dhruba)
- HADOOP-2185. RPC Server uses any available port if the specified
- port is zero. Otherwise it uses the specified port. Also combines
- the configuration attributes for the servers' bind address and
- port from "x.x.x.x" and "y" to "x.x.x.x:y".
- Deprecated configuration variables:
- dfs.info.bindAddress
- dfs.info.port
- dfs.datanode.bindAddress
- dfs.datanode.port
- dfs.datanode.info.bindAddress
- dfs.datanode.info.port
- dfs.secondary.info.bindAddress
- dfs.secondary.info.port
- mapred.job.tracker.info.bindAddress
- mapred.job.tracker.info.port
- mapred.task.tracker.report.bindAddress
- tasktracker.http.bindAddress
- tasktracker.http.port
- New configuration variables (post HADOOP-2404):
- dfs.secondary.http.address
- dfs.datanode.address
- dfs.datanode.http.address
- dfs.http.address
- mapred.job.tracker.http.address
- mapred.task.tracker.report.address
- mapred.task.tracker.http.address
- (Konstantin Shvachko via dhruba)
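The combined "x.x.x.x:y" form and the port-zero rule from HADOOP-2185 can be illustrated with the JDK networking classes. The sketch below is an assumption-laden illustration: the class `AddressSketch` and its methods are hypothetical, and Hadoop's own address parsing is not shown.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.ServerSocket;

/** Hypothetical helper, not part of Hadoop: parses "host:port" and binds to it. */
final class AddressSketch {

    /** Parses a combined "host:port" value such as "0.0.0.0:50070". */
    static InetSocketAddress parse(String hostPort) {
        int colon = hostPort.lastIndexOf(':');
        if (colon < 0) {
            throw new IllegalArgumentException("expected host:port, got " + hostPort);
        }
        String host = hostPort.substring(0, colon);
        int port = Integer.parseInt(hostPort.substring(colon + 1));
        return new InetSocketAddress(host, port);
    }

    /** Binds a server socket and reports the port actually in use. */
    static int bindAndGetPort(InetSocketAddress address) throws IOException {
        try (ServerSocket socket = new ServerSocket()) {
            socket.bind(address);
            // With a requested port of 0 the operating system picks any free port.
            return socket.getLocalPort();
        }
    }

    public static void main(String[] args) throws IOException {
        InetSocketAddress requested = parse("127.0.0.1:0");
        System.out.println("requested port " + requested.getPort()
            + ", bound to port " + bindAndGetPort(requested));
    }
}
```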
- HADOOP-2401. Only the current leaseholder can abandon a block for
- a HDFS file. ClientProtocol version changed from 20 to 21.
- (Tsz Wo (Nicholas), SZE via dhruba)
- HADOOP-2381. Support permission information in FileStatus. Client
- Protocol version changed from 21 to 22. (Raghu Angadi via dhruba)
- HADOOP-2110. Block report processing creates fewer transient objects.
- Datanode Protocol version changed from 10 to 11.
- (Sanjay Radia via dhruba)
-
- HADOOP-2567. Add FileSystem#getHomeDirectory(), which returns the
- user's home directory in a FileSystem as a fully-qualified path.
- FileSystem#getWorkingDirectory() is also changed to return a
- fully-qualified path, which can break applications that attempt
- to, e.g., pass LocalFileSystem#getWorkingDir().toString() directly
- to java.io methods that accept file names. (cutting)
- HADOOP-2514. Change trash feature to maintain a per-user trash
- directory, named ".Trash" in the user's home directory. The
- "fs.trash.root" parameter is no longer used. Full source paths
- are also no longer reproduced within the trash.
- HADOOP-2012. Periodic data verification on Datanodes.
- (Raghu Angadi via dhruba)
- HADOOP-1707. The DFSClient does not use a local disk file to cache
- writes to a HDFS file. Changed Data Transfer Version from 7 to 8.
- (dhruba)
- HADOOP-2652. Fix permission issues for HftpFileSystem. This is an
- incompatible change since distcp may not be able to copy files
- from cluster A (compiled with this patch) to cluster B (compiled
- with previous versions). (Tsz Wo (Nicholas), SZE via dhruba)
- NEW FEATURES
- HADOOP-1857. Ability to run a script when a task fails to capture stack
- traces. (Amareshwari Sri Ramadasu via ddas)
- HADOOP-2299. Definition of a login interface. A simple implementation for
- Unix users and groups. (Hairong Kuang via dhruba)
- HADOOP-1652. A utility to balance data among datanodes in a HDFS cluster.
- (Hairong Kuang via dhruba)
- HADOOP-2085. A library to support map-side joins of consistently
- partitioned and sorted data sets. (Chris Douglas via omalley)
- HADOOP-2336. Shell commands to modify file permissions. (rangadi)
- HADOOP-1298. Implement file permissions for HDFS.
- (Tsz Wo (Nicholas) & taton via cutting)
- HADOOP-2447. HDFS can be configured to limit the total number of
- objects (inodes and blocks) in the file system. (dhruba)
- HADOOP-2487. Added an option to get statuses for all submitted/run jobs.
- This information can be used to develop tools for analysing jobs.
- (Amareshwari Sri Ramadasu via acmurthy)
- HADOOP-1873. Implement user permissions for Map/Reduce framework.
- (Hairong Kuang via shv)
- HADOOP-2532. Add to MapFile a getClosest method that returns the key
- that comes just before if the key is not present. (stack via tomwhite)
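The lookup semantics described for HADOOP-2532 (return the key just before the requested one when it is absent) resemble `TreeMap.floorKey` on an in-memory sorted map. The following sketch uses that analogy only; `ClosestKeySketch` is a hypothetical class and does not use or reproduce the MapFile API.

```java
import java.util.TreeMap;

/** Hypothetical in-memory analogy, not the MapFile API. */
final class ClosestKeySketch {

    /** Returns `key` if present, otherwise the greatest key less than it, or null if none. */
    static String closestBefore(TreeMap<String, String> index, String key) {
        return index.floorKey(key);
    }

    public static void main(String[] args) {
        TreeMap<String, String> index = new TreeMap<>();
        index.put("a", "first");
        index.put("c", "second");
        index.put("e", "third");
        System.out.println(closestBefore(index, "d")); // c
        System.out.println(closestBefore(index, "c")); // c
    }
}
```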
-
- HADOOP-1883. Add versioning to Record I/O. (Vivek Ratan via ddas)
- HADOOP-2603. Add SequenceFileAsBinaryInputFormat, which reads
- sequence files as BytesWritable/BytesWritable regardless of the
- key and value types used to write the file. (cdouglas via omalley)
- HADOOP-2367. Add ability to profile a subset of map/reduce tasks and fetch
- the result to the local filesystem of the submitting application. Also
- includes a general IntegerRanges extension to Configuration for setting
- positive, ranged parameters. (Owen O'Malley via cdouglas)
- IMPROVEMENTS
- HADOOP-2045. Change committer list on website to a table, so that
- folks can list their organization, timezone, etc. (cutting)
- HADOOP-2058. Facilitate creating new datanodes dynamically in
- MiniDFSCluster. (Hairong Kuang via dhruba)
- HADOOP-1855. fsck verifies block placement policies and reports
- violations. (Konstantin Shvachko via dhruba)
- HADOOP-1604. A system administrator can finalize namenode upgrades
- without running the cluster. (Konstantin Shvachko via dhruba)
- HADOOP-1839. Link-ify the Pending/Running/Complete/Killed grid in
- jobdetails.jsp to help quickly narrow down and see categorized TIPs'
- details via jobtasks.jsp. (Amar Kamat via acmurthy)
- HADOOP-1210. Log counters in job history. (Owen O'Malley via ddas)
- HADOOP-1912. Datanode has two new commands COPY and REPLACE. These are
- needed for supporting data rebalance. (Hairong Kuang via dhruba)
- HADOOP-2086. This patch adds the ability to add dependencies to a job
- (run via JobControl) after construction. (Adrian Woodhead via ddas)
- HADOOP-1185. Support changing the logging level of a server without
- restarting the server. (Tsz Wo (Nicholas), SZE via dhruba)
- HADOOP-2134. Remove developer-centric requirements from overview.html and
- keep it end-user focussed, specifically sections related to subversion and
- building Hadoop. (Jim Kellerman via acmurthy)
- HADOOP-1989. Support simulated DataNodes. This helps creating large virtual
- clusters for testing purposes. (Sanjay Radia via dhruba)
-
- HADOOP-1274. Support different number of mappers and reducers per
- TaskTracker to allow administrators to better configure and utilize
- heterogenous clusters.
- Configuration changes to hadoop-default.xml:
- add mapred.tasktracker.map.tasks.maximum (default value of 2)
- add mapred.tasktracker.reduce.tasks.maximum (default value of 2)
- remove mapred.tasktracker.tasks.maximum (deprecated for 0.16.0)
- (Amareshwari Sri Ramadasu via acmurthy)
- HADOOP-2104. Adds a description to the ant targets. This makes the
- output of "ant -projecthelp" sensible. (Chris Douglas via ddas)
- HADOOP-2127. Added a pipes sort example to benchmark trivial pipes
- application versus trivial java application. (omalley via acmurthy)
- HADOOP-2113. A new shell command "dfs -text" to view the contents of
- a gzipped or SequenceFile. (Chris Douglas via dhruba)
- HADOOP-2207. Add a "package" target for contrib modules that
- permits each to determine what files are copied into release
- builds. (stack via cutting)
- HADOOP-1984. Makes the backoff for failed fetches exponential.
- Earlier, it was a random backoff from an interval.
- (Amar Kamat via ddas)
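Exponential backoff, the technique named in HADOOP-1984, doubles the wait after each consecutive failure up to a cap. The sketch below shows the general pattern only. The class `BackoffSketch`, the base delay, and the cap are hypothetical values chosen for illustration; the entry does not state the constants Hadoop uses.

```java
/** Hypothetical helper, not part of Hadoop: capped exponential backoff. */
final class BackoffSketch {

    /**
     * Returns the delay before the next retry: baseMillis after the first failure,
     * doubling with each further failure, never exceeding maxMillis.
     */
    static long backoffMillis(int failures, long baseMillis, long maxMillis) {
        if (failures <= 0) {
            return 0; // nothing has failed yet, so no wait
        }
        long delay = baseMillis;
        for (int i = 1; i < failures && delay < maxMillis; i++) {
            // Clamp before doubling so the multiplication cannot overflow past the cap.
            delay = (delay > maxMillis / 2) ? maxMillis : delay * 2;
        }
        return Math.min(delay, maxMillis);
    }

    public static void main(String[] args) {
        for (int failures = 1; failures <= 7; failures++) {
            System.out.println(failures + " failure(s): wait "
                + backoffMillis(failures, 1000, 60000) + " ms");
        }
    }
}
```

With a base of 1000 ms and a cap of 60000 ms, the delays are 1000, 2000, 4000, 8000, 16000, 32000 and then 60000 ms.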
- HADOOP-1327. Include website documentation for streaming. (Rob Weltman
- via omalley)
- HADOOP-2000. Rewrite NNBench to measure namenode performance accurately.
- It now uses the map-reduce framework for load generation.
- (Mukund Madhugiri via dhruba)
- HADOOP-2248. Speeds up the framework w.r.t Counters. Also has API
- updates to the Counters part. (Owen O'Malley via ddas)
- HADOOP-2326. The initial block report at Datanode startup time has
- a random backoff period. (Sanjay Radia via dhruba)
- HADOOP-2432. HDFS includes the name of the file while throwing
- "File does not exist" exception. (Jim Kellerman via dhruba)
- HADOOP-2457. Added a 'forrest.home' property to the 'docs' target in
- build.xml. (acmurthy)
- HADOOP-2149. A new benchmark for three name-node operations: file create,
- open, and block report, to evaluate the name-node performance
- for optimizations or new features. (Konstantin Shvachko via shv)
- HADOOP-2466. Change FileInputFormat.computeSplitSize to a protected
- non-static method to allow sub-classes to provide alternate
- implementations. (Alejandro Abdelnur via acmurthy)
- HADOOP-2425. Change TextOutputFormat to handle Text specifically for better
- performance. Make NullWritable implement Comparable. Make TextOutputFormat
- treat NullWritable like null. (omalley)
- HADOOP-1719. Improves the utilization of shuffle copier threads.
- (Amar Kamat via ddas)
-
- HADOOP-2390. Added documentation for user-controls for intermediate
- map-outputs & final job-outputs and native-hadoop libraries. (acmurthy)
-
- HADOOP-1660. Add the cwd of the map/reduce task to the java.library.path
- of the child-jvm to support loading of native libraries distributed via
- the DistributedCache. (acmurthy)
-
- HADOOP-2285. Speeds up TextInputFormat. Also includes updates to the
- Text API. (Owen O'Malley via cdouglas)
- HADOOP-2233. Adds a generic load generator for modeling MR jobs. (cdouglas)
- HADOOP-2369. Adds a set of scripts for simulating a mix of user map/reduce
- workloads. (Runping Qi via cdouglas)
- HADOOP-2547. Removes use of a 'magic number' in build.xml.
- (Hrishikesh via nigel)
- HADOOP-2268. Fix org.apache.hadoop.mapred.jobcontrol classes to use the
- List/Map interfaces rather than concrete ArrayList/HashMap classes
- internally. (Adrian Woodhead via acmurthy)
- HADOOP-2406. Add a benchmark for measuring read/write performance through
- the InputFormat interface, particularly with compression. (cdouglas)
- HADOOP-2131. Allow finer-grained control over speculative-execution. Now
- users can set it for maps and reduces independently.
- Configuration changes to hadoop-default.xml:
- deprecated mapred.speculative.execution
- add mapred.map.tasks.speculative.execution
- add mapred.reduce.tasks.speculative.execution
- (Amareshwari Sri Ramadasu via acmurthy)
-
- HADOOP-1965. Interleave sort/spill in the map-task along with calls to the
- Mapper.map method. This is done by splitting the 'io.sort.mb' buffer into
- two and using one half for collecting map-outputs and the other half for
- sort/spill. (Amar Kamat via acmurthy)
-
- HADOOP-2464. Unit tests for chmod, chown, and chgrp using DFS.
- (Raghu Angadi)
- HADOOP-1876. Persist statuses of completed jobs in HDFS so that the
- JobClient can query and get information about decommissioned jobs and also
- across JobTracker restarts.
- Configuration changes to hadoop-default.xml:
- add mapred.job.tracker.persist.jobstatus.active (default value of false)
- add mapred.job.tracker.persist.jobstatus.hours (default value of 0)
- add mapred.job.tracker.persist.jobstatus.dir (default value of
- /jobtracker/jobsInfo)
- (Alejandro Abdelnur via acmurthy)
- HADOOP-2077. Added version and build information to STARTUP_MSG for all
- hadoop daemons to aid error-reporting, debugging etc. (acmurthy)
- HADOOP-2398. Additional instrumentation for NameNode and RPC server.
- Add support for accessing instrumentation statistics via JMX.
- (Sanjay Radia via dhruba)
- HADOOP-2449. A return of the non-MR version of NNBench.
- (Sanjay Radia via shv)
- HADOOP-1989. Remove 'datanodecluster' command from bin/hadoop.
- (Sanjay Radia via shv)
- HADOOP-1742. Improve JavaDoc documentation for ClientProtocol, DFSClient,
- and FSNamesystem. (Konstantin Shvachko)
- HADOOP-2298. Add Ant target for a binary-only distribution.
- (Hrishikesh via nigel)
- HADOOP-2509. Add Ant target for Rat report (Apache license header
- reports). (Hrishikesh via nigel)
- HADOOP-2469. WritableUtils.clone should take a Configuration
- instead of a JobConf. (stack via omalley)
- HADOOP-2659. Introduce superuser permissions for admin operations.
- (Tsz Wo (Nicholas), SZE via shv)
- HADOOP-2596. Added a SequenceFile.createWriter api which allows the user
- to specify the blocksize, replication factor and the buffersize to be
- used for the underlying HDFS file. (Alejandro Abdelnur via acmurthy)
- HADOOP-2431. Test HDFS File Permissions. (Hairong Kuang via shv)
- HADOOP-2232. Add an option to disable Nagle's algorithm in the IPC stack.
- (Clint Morgan via cdouglas)
- HADOOP-2342. Created a micro-benchmark for measuring
- local-file versus hdfs reads. (Owen O'Malley via nigel)
- HADOOP-2529. First version of HDFS User Guide. (Raghu Angadi)
- HADOOP-2690. Add jar-test target to build.xml, separating compilation
- and packaging of the test classes. (Enis Soztutar via cdouglas)
- OPTIMIZATIONS
- HADOOP-1898. Release the lock protecting the last time of the last stack
- dump while the dump is happening. (Amareshwari Sri Ramadasu via omalley)
- HADOOP-1900. Makes the heartbeat and task event queries interval
- dependent on the cluster size. (Amareshwari Sri Ramadasu via ddas)
- HADOOP-2208. Counter update frequency (from TaskTracker to JobTracker) is
- capped at 1 minute. (Amareshwari Sri Ramadasu via ddas)
- HADOOP-2284. Reduce the number of progress updates during the sorting in
- the map task. (Amar Kamat via ddas)
- BUG FIXES
- HADOOP-2583. Fixes a bug in the Eclipse plug-in UI to edit locations.
- Plug-in version is now synchronized with Hadoop version.
- HADOOP-2100. Remove faulty check for existence of $HADOOP_PID_DIR and let
- 'mkdir -p' check & create it. (Michael Bieniosek via acmurthy)
- HADOOP-1642. Ensure jobids generated by LocalJobRunner are unique to
- avoid collisions and hence job-failures. (Doug Cutting via acmurthy)
- HADOOP-2096. Close open file-descriptors held by streams while localizing
- job.xml in the JobTracker and while displaying it on the webui in
- jobconf.jsp. (Amar Kamat via acmurthy)
- HADOOP-2098. Log start & completion of empty jobs to JobHistory, which
- also ensures that we close the file-descriptor of the job's history log
- opened during job-submission. (Amar Kamat via acmurthy)
- HADOOP-2112. Adding back changes to build.xml lost while reverting
- HADOOP-1622 i.e. http://svn.apache.org/viewvc?view=rev&revision=588771.
- (acmurthy)
- HADOOP-2089. Fixes the command line argument handling to handle multiple
- -cacheArchive in Hadoop streaming. (Lohit Vijayarenu via ddas)
- HADOOP-2071. Fix StreamXmlRecordReader to use a BufferedInputStream
- wrapped over the DFSInputStream since mark/reset aren't supported by
- DFSInputStream anymore. (Lohit Vijayarenu via acmurthy)
- HADOOP-1348. Allow XML comments inside configuration files.
- (Rajagopal Natarajan and Enis Soztutar via enis)
- HADOOP-1952. Improve handling of invalid, user-specified classes while
- configuring streaming jobs such as combiner, input/output formats etc.
- Now invalid options are caught, logged and jobs are failed early. (Lohit
- Vijayarenu via acmurthy)
- HADOOP-2151. FileSystem.globPaths validates the list of Paths that
- it returns. (Lohit Vijayarenu via dhruba)
- HADOOP-2121. Cleanup DFSOutputStream when the stream encountered errors
- when Datanodes became full. (Raghu Angadi via dhruba)
- HADOOP-1130. The FileSystem.closeAll() method closes all existing
- DFSClients. (Chris Douglas via dhruba)
- HADOOP-2204. DFSTestUtil.waitReplication was not waiting for all replicas
- to get created, thus causing unit test failure.
- (Raghu Angadi via dhruba)
- HADOOP-2078. A zero size file may have no blocks associated with it.
- (Konstantin Shvachko via dhruba)
- HADOOP-2212. ChecksumFileSystem.getSumBufferSize might throw
- java.lang.ArithmeticException. The fix is to initialize bytesPerChecksum
- to 0. (Michael Bieniosek via ddas)
- HADOOP-2216. Fix jobtasks.jsp to ensure that it first collects the
- taskids which satisfy the filtering criteria and then use that list to
- print out only the required task-reports; previously it was oblivious to
- the filtering and hence used the wrong index into the array of task-reports.
- (Amar Kamat via acmurthy)
- HADOOP-2272. Fix findbugs target to reflect changes made to the location
- of the streaming jar file by HADOOP-2207. (Adrian Woodhead via nigel)
- HADOOP-2244. Fixes the MapWritable.readFields to clear the instance
- field variable every time readFields is called. (Michael Stack via ddas).
- HADOOP-2245. Fixes LocalJobRunner to include a jobId in the mapId. Also,
- adds a testcase for JobControl. (Adrian Woodhead via ddas).
- HADOOP-2275. Fix erroneous detection of corrupted file when namenode
- fails to allocate any datanodes for newly allocated block.
- (Dhruba Borthakur via dhruba)
- HADOOP-2256. Fix a bug in the namenode that could cause it to encounter
- an infinite loop while deleting excess replicas that were created by
- block rebalancing. (Hairong Kuang via dhruba)
- HADOOP-2209. SecondaryNamenode process exits if it encounters exceptions
- that it cannot handle. (Dhruba Borthakur via dhruba)
- HADOOP-2314. Prevent TestBlockReplacement from occasionally getting
- into an infinite loop. (Hairong Kuang via dhruba)
- HADOOP-2300. This fixes a bug where mapred.tasktracker.tasks.maximum
- would be ignored even if it was set in hadoop-site.xml.
- (Amareshwari Sri Ramadasu via ddas)
- HADOOP-2349. Improve code layout in file system transaction logging code.
- (Tsz Wo (Nicholas), SZE via dhruba)
- HADOOP-2368. Fix unit tests on Windows.
- (Tsz Wo (Nicholas), SZE via dhruba)
- HADOOP-2363. This fix allows running multiple instances of the unit test
- in parallel. The bug was introduced in HADOOP-2185 that changed
- port-rolling behaviour. (Konstantin Shvachko via dhruba)
- HADOOP-2271. Fix chmod task to be non-parallel. (Adrian Woodhead via
- omalley)
- HADOOP-2313. Fail the build if building libhdfs fails. (nigel via omalley)
- HADOOP-2359. Remove warning for interrupted exception when closing down
- minidfs. (dhruba via omalley)
- HADOOP-1841. Prevent slow clients from consuming threads in the NameNode.
- (dhruba)
-
- HADOOP-2323. JobTracker.close() should not print stack traces for
- normal exit. (jimk via cutting)
- HADOOP-2376. Prevents sort example from overriding the number of maps.
- (Owen O'Malley via ddas)
- HADOOP-2434. FSDatasetInterface read interface causes HDFS reads to occur
- in 1 byte chunks, causing performance degradation.
- (Raghu Angadi via dhruba)
- HADOOP-2459. Fix package target so that src/docs/build files are not
- included in the release. (nigel)
- HADOOP-2215. Fix documentation in cluster_setup.html &
- mapred_tutorial.html to reflect that mapred.tasktracker.tasks.maximum has
- been superseded by mapred.tasktracker.{map|reduce}.tasks.maximum.
- (Amareshwari Sri Ramadasu via acmurthy)
- HADOOP-2352. Remove AC_CHECK_LIB for libz and liblzo to ensure that
- libhadoop.so doesn't have a dependency on them. (acmurthy)
- HADOOP-2453. Fix the configuration for wordcount-simple example in Hadoop
- Pipes which currently produces an XML parsing error. (Amareshwari Sri
- Ramadasu via acmurthy)
- HADOOP-2476. Unit test failure while reading permission bits of local
- file system (on Windows) fixed. (Raghu Angadi via dhruba)
- HADOOP-2247. Fine-tune the strategies for killing mappers and reducers
- due to failures while fetching map-outputs. Now the map-completion times
- and number of currently running reduces are taken into account by the
- JobTracker before killing the mappers, while the progress made by the
- reducer and the number of fetch-failures vis-a-vis total number of
- fetch-attempts are taken into account before the reducer kills itself.
- (Amar Kamat via acmurthy)
-
- HADOOP-2452. Fix eclipse plug-in build.xml to refer to the right
- location where hadoop-*-core.jar is generated. (taton)
- HADOOP-2492. Additional debugging in the rpc server to better
- diagnose ConcurrentModificationException. (dhruba)
- HADOOP-2344. Enhance the utility for executing shell commands to read the
- stdout/stderr streams while waiting for the command to finish (to free up
- the buffers). Also, this patch throws away stderr of the DF utility.
- @deprecated
- org.apache.hadoop.fs.ShellCommand for org.apache.hadoop.util.Shell
- org.apache.hadoop.util.ShellUtil for
- org.apache.hadoop.util.Shell.ShellCommandExecutor
- (Amar Kamat via acmurthy)
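
The idea behind HADOOP-2344, draining a child's stdout and stderr while waiting for it so that a full pipe buffer cannot stall the command, can be sketched as follows. This is a minimal Python illustration using the standard library, not the code of org.apache.hadoop.util.Shell; the function name `run_command` is hypothetical.

```python
import subprocess
import sys


def run_command(argv):
    """Run argv and return (exit_code, stdout, stderr).

    Hypothetical helper for illustration only. communicate() reads both
    pipes while it waits for the child, so a child that writes more than
    one pipe buffer of output cannot block on a full pipe.
    """
    proc = subprocess.Popen(
        argv,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        text=True,
    )
    out, err = proc.communicate()
    return proc.returncode, out, err


if __name__ == "__main__":
    # A child that prints far more than a typical pipe buffer holds.
    code, out, err = run_command(
        [sys.executable, "-c", "print('x' * 100000)"]
    )
    print(code, len(out.strip()))
```

Calling `proc.wait()` before reading the pipes would be the pattern the entry warns against: the child can fill the pipe and never exit.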
- HADOOP-2511. Fix a javadoc warning in org.apache.hadoop.util.Shell
- introduced by HADOOP-2344. (acmurthy)
- HADOOP-2442. Fix TestLocalFileSystemPermission.testLocalFSsetOwner
- to work on more platforms. (Raghu Angadi via nigel)
- HADOOP-2488. Fix a regression in random read performance.
- (Michael Stack via rangadi)
- HADOOP-2523. Fix TestDFSShell.testFilePermissions on Windows.
- (Raghu Angadi via nigel)
- HADOOP-2535. Removed support for deprecated mapred.child.heap.size and
- fixed some indentation issues in TaskRunner. (acmurthy)
- Configuration changes to hadoop-default.xml:
- remove mapred.child.heap.size
- HADOOP-2512. Fix error stream handling in Shell. Use exit code to
- detect shell command errors in RawLocalFileSystem. (Raghu Angadi)
- HADOOP-2446. Fixes TestHDFSServerPorts and TestMRServerPorts so they
- do not rely on statically configured ports and cleanup better. (nigel)
- HADOOP-2537. Make build process compatible with Ant 1.7.0.
- (Hrishikesh via nigel)
- HADOOP-1281. Ensure running tasks of completed map TIPs (e.g. speculative
- tasks) are killed as soon as the TIP completed. (acmurthy)
- HADOOP-2571. Suppress a spurious warning in test code. (cdouglas)
- HADOOP-2481. NNBench reports its progress periodically.
- (Hairong Kuang via dhruba)
- HADOOP-2601. Start name-node on a free port for TestNNThroughputBenchmark.
- (Konstantin Shvachko)
- HADOOP-2494. Set +x on contrib/*/bin/* in packaged tar bundle.
- (stack via tomwhite)
- HADOOP-2605. Remove bogus leading slash in task-tracker report bindAddress.
- (Konstantin Shvachko)
-
- HADOOP-2620. Trivial. 'bin/hadoop fs -help' did not list chmod, chown, and
- chgrp. (Raghu Angadi)
- HADOOP-2614. The DFS WebUI accesses are configured to be from the user
- specified by dfs.web.ugi. (Tsz Wo (Nicholas), SZE via dhruba)
- HADOOP-2543. Implement a "no-permission-checking" mode for smooth
- upgrade from a pre-0.16 install of HDFS.
- (Hairong Kuang via dhruba)
- HADOOP-290. A DataNode log message now prints the target of a replication
- request correctly. (dhruba)
- HADOOP-2538. Redirect to a warning, if plaintext parameter is true but
- the filter parameter is not given in TaskLogServlet.
- (Michael Bieniosek via enis)
- HADOOP-2582. Prevent 'bin/hadoop fs -copyToLocal' from creating
- zero-length files when the src does not exist.
- (Lohit Vijayarenu via cdouglas)
- HADOOP-2189. Incrementing user counters should count as progress. (ddas)
- HADOOP-2649. The NameNode periodically computes replication work for
- the datanodes. The periodicity of this computation is now configurable.
- (dhruba)
- HADOOP-2549. Correct disk size computation so that data-nodes could switch
- to other local drives if current is full. (Hairong Kuang via shv)
- HADOOP-2633. Fsck should call name-node methods directly rather than
- through rpc. (Tsz Wo (Nicholas), SZE via shv)
- HADOOP-2687. Modify a few log messages generated by dfs client to be
- logged only at INFO level. (stack via dhruba)
- HADOOP-2402. Fix BlockCompressorStream to ensure it buffers data before
- sending it down to the compressor so that each write call doesn't
- compress. (Chris Douglas via acmurthy)
- HADOOP-2645. The Metrics initialization code does not throw
- exceptions when servers are restarted by MiniDFSCluster.
- (Sanjay Radia via dhruba)
- HADOOP-2691. Fix a race condition that was causing the DFSClient
- to erroneously remove a good datanode from a pipeline that actually
- had another datanode that was bad. (dhruba)
- HADOOP-1195. All code in FSNamesystem checks the return value
- of getDataNode for null before using it. (dhruba)
- HADOOP-2640. Fix a bug in MultiFileSplitInputFormat that was always
- returning 1 split in some circumstances. (Enis Soztutar via nigel)
- HADOOP-2626. Fix paths with special characters to work correctly
- with the local filesystem. (Thomas Friol via cutting)
- HADOOP-2646. Fix SortValidator to work with fully-qualified
- working directories. (Arun C Murthy via nigel)
- HADOOP-2092. Added a ping mechanism to the pipes' task to periodically
- check if the parent Java task is running, and exit if the parent isn't
- alive and responding. (Amareshwari Sri Ramadasu via acmurthy)
- HADOOP-2714. TestDecommission failed on windows because the replication
- request was timing out. (dhruba)
- HADOOP-2576. Namenode performance degradation over time triggered by
- large heartbeat interval. (Raghu Angadi)
- HADOOP-2713. TestDatanodeDeath failed on windows because the replication
- request was timing out. (dhruba)
- HADOOP-2639. Fixes a problem to do with incorrect maintenance of values
- for runningMapTasks/runningReduceTasks. (Amar Kamat and Arun Murthy
- via ddas)
- HADOOP-2723. Fixed the check for checking whether to do user task
- profiling. (Amareshwari Sri Ramadasu via omalley)
- HADOOP-2734. Link forrest docs to new http://hadoop.apache.org
- (Doug Cutting via nigel)
- HADOOP-2641. Added Apache license headers to 95 files. (nigel)
- HADOOP-2732. Fix bug in path globbing. (Hairong Kuang via nigel)
- HADOOP-2404. Fix backwards compatibility with hadoop-0.15 configuration
- files that was broken by HADOOP-2185. (omalley)
- HADOOP-2755. Fix fsck performance degradation because of permissions
- issue. (Tsz Wo (Nicholas), SZE via dhruba)
- HADOOP-2768. Fix performance regression caused by HADOOP-1707.
- (dhruba borthakur via nigel)
- HADOOP-3108. Fix NPE in setPermission and setOwner. (shv)
- Release 0.15.3 - 2008-01-18
- BUG FIXES
- HADOOP-2562. globPaths supports {ab,cd}. (Hairong Kuang via dhruba)
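
The `{ab,cd}` syntax named in HADOOP-2562 means that one pattern stands for several alternatives. The sketch below shows that expansion in Python under simplifying assumptions: it handles flat groups only (no nesting, no escaping) and is not the FileSystem implementation. The function name `expand_braces` is hypothetical.

```python
def expand_braces(pattern):
    """Expand flat {a,b,...} groups into the list of concrete patterns.

    Illustrative only: nested groups and escaped braces are not handled.
    """
    start = pattern.find("{")
    if start == -1:
        return [pattern]
    end = pattern.find("}", start)
    if end == -1:
        return [pattern]  # unmatched brace: keep the pattern literal
    prefix, suffix = pattern[:start], pattern[end + 1:]
    results = []
    for alternative in pattern[start + 1:end].split(","):
        # Recurse so that later groups in the suffix are expanded too.
        results.extend(expand_braces(prefix + alternative + suffix))
    return results


if __name__ == "__main__":
    print(expand_braces("/logs/{ab,cd}/part"))
```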
- HADOOP-2540. fsck reports missing blocks incorrectly. (dhruba)
- HADOOP-2570. "work" directory created unconditionally, and symlinks
- created from the task cwds.
- HADOOP-2574. Fixed mapred_tutorial.xml to correct minor errors with the
- WordCount examples. (acmurthy)
- Release 0.15.2 - 2008-01-02
- BUG FIXES
- HADOOP-2246. Moved the changelog for HADOOP-1851 from the NEW FEATURES
- section to the INCOMPATIBLE CHANGES section. (acmurthy)
- HADOOP-2238. Fix TaskGraphServlet so that it sets the content type of
- the response appropriately. (Paul Saab via enis)
- HADOOP-2129. Fix so that distcp works correctly when source is
- HDFS but not the default filesystem. HDFS paths returned by the
- listStatus() method are now fully-qualified. (cutting)
- HADOOP-2378. Fixes a problem where the last task completion event would
- get created after the job completes. (Alejandro Abdelnur via ddas)
- HADOOP-2228. Checks whether a job with a certain jobId is already running
- and then tries to create the JobInProgress object.
- (Johan Oskarsson via ddas)
- HADOOP-2422. dfs -cat multiple files fail with 'Unable to write to
- output stream'. (Raghu Angadi via dhruba)
- HADOOP-2460. When the namenode encounters ioerrors on writing a
- transaction log, it stops writing new transactions to that one.
- (Raghu Angadi via dhruba)
- HADOOP-2227. Use the LocalDirAllocator uniformly for handling all of the
- temporary storage required for a given task. It also implies that
- mapred.local.dir.minspacestart is handled by checking if there is enough
- free-space on any one of the available disks. (Amareshwari Sri Ramadasu
- via acmurthy)
- HADOOP-2437. Fix the LocalDirAllocator to choose the seed for the
- round-robin disk selections randomly. This helps in spreading data across
- multiple partitions much better. (acmurthy)
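
The effect of HADOOP-2437 can be pictured with a small round-robin chooser whose starting index is random. This is a hedged Python sketch, not the LocalDirAllocator code; the class name `RoundRobinDirChooser` and the `has_space` callback are assumptions made for illustration.

```python
import random


class RoundRobinDirChooser:
    """Pick directories in rotation, starting from a random position."""

    def __init__(self, dirs, rng=None):
        if not dirs:
            raise ValueError("at least one directory is required")
        self.dirs = list(dirs)
        rng = rng or random.Random()
        # A random seed index keeps independent choosers from all
        # starting on the first directory in the list.
        self.next_index = rng.randrange(len(self.dirs))

    def choose(self, has_space=lambda d: True):
        # Try each directory at most once, advancing the rotation.
        for _ in range(len(self.dirs)):
            candidate = self.dirs[self.next_index]
            self.next_index = (self.next_index + 1) % len(self.dirs)
            if has_space(candidate):
                return candidate
        raise IOError("no directory has enough free space")


if __name__ == "__main__":
    chooser = RoundRobinDirChooser(["/d0", "/d1", "/d2"])
    print([chooser.choose() for _ in range(4)])
```

With a fixed start, many short-lived tasks would each write to the first directory; a random start spreads that first write across disks.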
- HADOOP-2486. When the list of files from the InMemoryFileSystem is obtained
- for merging, this patch will ensure that only those files whose checksums
- have also got created (renamed) are returned. (ddas)
- HADOOP-2456. Hardcode English locale to prevent NumberFormatException
- from occurring when starting the NameNode with certain locales.
- (Matthias Friedrich via nigel)
- IMPROVEMENTS
- HADOOP-2160. Remove project-level, non-user documentation from
- releases, since it's now maintained in a separate tree. (cutting)
- HADOOP-1327. Add user documentation for streaming. (cutting)
- HADOOP-2382. Add hadoop-default.html to subversion. (cutting)
- HADOOP-2158. hdfsListDirectory calls FileSystem.listStatus instead
- of FileSystem.listPaths. This reduces the number of RPC calls on the
- namenode, thereby improving scalability. (Christian Kunz via dhruba)
- Release 0.15.1 - 2007-11-27
- INCOMPATIBLE CHANGES
- HADOOP-713. Reduce CPU usage on namenode while listing directories.
- FileSystem.listPaths does not return the size of the entire subtree.
- Introduced a new API ClientProtocol.getContentLength that returns the
- size of the subtree. (Dhruba Borthakur via dhruba)
- IMPROVEMENTS
- HADOOP-1917. Addition of guides/tutorial for better overall
- documentation for Hadoop. Specifically:
- * quickstart.html is targeted towards first-time users and helps them
- set up a single-node cluster and play with Hadoop.
- * cluster_setup.html helps admins to configure and set up non-trivial
- hadoop clusters.
- * mapred_tutorial.html is a comprehensive Map-Reduce tutorial.
- (acmurthy)
- BUG FIXES
- HADOOP-2174. Removed the unnecessary Reporter.setStatus call from
- FSCopyFilesMapper.close which led to a NPE since the reporter isn't valid
- in the close method. (Chris Douglas via acmurthy)
- HADOOP-2172. Restore performance of random access to local files
- by caching positions of local input streams, avoiding a system
- call. (cutting)
- HADOOP-2205. Regenerate the Hadoop website since some of the changes made
- by HADOOP-1917 weren't correctly copied over to the trunk/docs directory.
- Also fixed a couple of minor typos and broken links. (acmurthy)
- Release 0.15.0 - 2007-11-2
- INCOMPATIBLE CHANGES
- HADOOP-1708. Make files appear in namespace as soon as they are
- created. (Dhruba Borthakur via dhruba)
- HADOOP-999. A HDFS Client immediately informs the NameNode of a new
- file creation. ClientProtocol version changed from 14 to 15.
- (Tsz Wo (Nicholas), SZE via dhruba)
- HADOOP-932. File locking interfaces and implementations (that were
- earlier deprecated) are removed. Client Protocol version changed
- from 15 to 16. (Raghu Angadi via dhruba)
- HADOOP-1621. FileStatus is now a concrete class and FileSystem.listPaths
- is deprecated and replaced with listStatus. (Chris Douglas via omalley)
- HADOOP-1656. The blockSize of a file is stored persistently in the file
- inode. (Dhruba Borthakur via dhruba)
- HADOOP-1838. The blocksize of files created with an earlier release is
- set to the default block size. (Dhruba Borthakur via dhruba)
- HADOOP-785. Add support for 'final' Configuration parameters,
- removing support for 'mapred-default.xml', and changing
- 'hadoop-site.xml' to not override other files. Now folks should
- generally use 'hadoop-site.xml' for all configurations. Values
- with a 'final' tag may not be overridden by subsequently loaded
- configuration files, e.g., by jobs. (Arun C. Murthy via cutting)
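
The 'final' rule in HADOOP-785 can be sketched as a layered merge in which a value marked final blocks later overrides. The Python below is an illustration under assumptions, not the Configuration class: each layer is modelled as a dict of `name -> (value, is_final)`, the function name `load_layers` is hypothetical, and the property names in the example are placeholders.

```python
def load_layers(layers):
    """Merge configuration layers in load order.

    Each layer maps a property name to (value, is_final). Once a
    property has been loaded with is_final=True, later layers cannot
    change it.
    """
    values = {}
    final_keys = set()
    for layer in layers:
        for name, (value, is_final) in layer.items():
            if name in final_keys:
                continue  # an earlier layer marked this property final
            values[name] = value
            if is_final:
                final_keys.add(name)
    return values


if __name__ == "__main__":
    site = {
        "mapred.child.java.opts": ("-Xmx200m", True),
        "io.sort.mb": ("100", False),
    }
    job = {
        "mapred.child.java.opts": ("-Xmx2g", False),
        "io.sort.mb": ("200", False),
    }
    print(load_layers([site, job]))
```

In this example the job layer can raise the non-final value but cannot touch the one the site layer marked final.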
- HADOOP-1846. DatanodeReport in ClientProtocol can report live
- datanodes, dead datanodes or all datanodes. Client Protocol version
- changed from 17 to 18. (Hairong Kuang via dhruba)
- HADOOP-1851. Permit specification of map output compression type
- and codec, independent of the final output's compression
- parameters. (Arun C Murthy via cutting)
- HADOOP-1819. Jobtracker cleanups, including binding ports before
- clearing state directories, so that inadvertently starting a
- second jobtracker doesn't trash one that's already running. Removed
- method JobTracker.getTracker() because the static variable, which
- stored the value caused initialization problems.
- (omalley via cutting)
- NEW FEATURES
- HADOOP-89. A client can access file data even before the creator
- has closed the file. Introduce a new command "tail" from dfs shell.
- (Dhruba Borthakur via dhruba)
- HADOOP-1636. Allow configuration of the number of jobs kept in
- memory by the JobTracker. (Michael Bieniosek via omalley)
- HADOOP-1667. Reorganize CHANGES.txt into sections to make it
- easier to read. Also remove numbering, to make merging easier.
- (cutting)
- HADOOP-1610. Add metrics for failed tasks.
- (Devaraj Das via tomwhite)
- HADOOP-1767. Add "bin/hadoop job -list" sub-command. (taton via cutting)
- HADOOP-1351. Add "bin/hadoop job [-fail-task|-kill-task]" sub-commands
- to terminate a particular task-attempt. (Enis Soztutar via acmurthy)
- HADOOP-1880. SleepJob : An example job that sleeps at each map and
- reduce task. (enis)
- HADOOP-1809. Add a link in web site to #hadoop IRC channel. (enis)
- HADOOP-1894. Add percentage graphs and mapred task completion graphs
- to Web User Interface. Users not using Firefox may install a plugin to
- their browsers to see svg graphics. (enis)
- HADOOP-1914. Introduce a new NamenodeProtocol to allow secondary
- namenodes and rebalancing processes to communicate with a primary
- namenode. (Hairong Kuang via dhruba)
- HADOOP-1963. Add a FileSystem implementation for the Kosmos
- Filesystem (KFS). (Sriram Rao via cutting)
- HADOOP-1822. Allow the specialization and configuration of socket
- factories. Provide a StandardSocketFactory, and a SocksSocketFactory to
- allow the use of SOCKS proxies. (taton)
- HADOOP-1968. FileSystem supports wildcard input syntax "{ }".
- (Hairong Kuang via dhruba)
- HADOOP-2566. Add globStatus method to the FileSystem interface
- and deprecate globPath and listPath. (Hairong Kuang via hairong)
- OPTIMIZATIONS
- HADOOP-1910. Reduce the number of RPCs that DistributedFileSystem.create()
- makes to the namenode. (Raghu Angadi via dhruba)
- HADOOP-1565. Reduce memory usage of NameNode by replacing
- TreeMap in HDFS Namespace with ArrayList.
- (Dhruba Borthakur via dhruba)
- HADOOP-1743. Change DFS INode from a nested class to standalone
- class, with specialized subclasses for directories and files, to
- save memory on the namenode. (Konstantin Shvachko via cutting)
- HADOOP-1759. Change file name in INode from String to byte[],
- saving memory on the namenode. (Konstantin Shvachko via cutting)
- HADOOP-1766. Save memory in namenode by having BlockInfo extend
- Block, and replace many uses of Block with BlockInfo.
- (Konstantin Shvachko via cutting)
- HADOOP-1687. Save memory in namenode by optimizing BlockMap
- representation. (Konstantin Shvachko via cutting)
- HADOOP-1774. Remove use of INode.parent in Block CRC upgrade.
- (Raghu Angadi via dhruba)
- HADOOP-1788. Increase the buffer size on the Pipes command socket.
- (Amareshwari Sri Ramadasu and Christian Kunz via omalley)
- BUG FIXES
- HADOOP-1946. The Datanode code does not need to invoke du on
- every heartbeat. (Hairong Kuang via dhruba)
- HADOOP-1935. Fix a NullPointerException in internalReleaseCreate.
- (Dhruba Borthakur)
- HADOOP-1933. The nodes listed in include and exclude files
- are always listed in the datanode report.
- (Raghu Angadi via dhruba)
- HADOOP-1953. The job tracker should wait between calls to try and delete
- the system directory. (Owen O'Malley via devaraj)
- HADOOP-1932. TestFileCreation fails with message saying filestatus.dat
- is of incorrect size. (Dhruba Borthakur via dhruba)
- HADOOP-1573. Support for 0 reducers in PIPES.
- (Owen O'Malley via devaraj)
- HADOOP-1500. Fix typographical errors in the DFS WebUI.
- (Nigel Daley via dhruba)
- HADOOP-1076. Periodic checkpoint can continue even if an earlier
- checkpoint encountered an error. (Dhruba Borthakur via dhruba)
- HADOOP-1887. The Namenode encounters an ArrayIndexOutOfBoundsException
- while listing a directory that had a file that was
- being actively written to. (Dhruba Borthakur via dhruba)
- HADOOP-1904. The Namenode encounters an exception because the
- list of blocks per datanode-descriptor was corrupted.
- (Konstantin Shvachko via dhruba)
- HADOOP-1762. The Namenode fsimage does not contain a list of
- Datanodes. (Raghu Angadi via dhruba)
- HADOOP-1890. Removed debugging prints introduced by HADOOP-1774.
- (Raghu Angadi via dhruba)
- HADOOP-1763. Too many lost task trackers on large clusters due to
- insufficient number of RPC handler threads on the JobTracker.
- (Devaraj Das)
- HADOOP-1463. HDFS reports correct usage statistics for disk space
- used by HDFS. (Hairong Kuang via dhruba)
- HADOOP-1692. In DFS ant task, don't cache the Configuration.
- (Chris Douglas via cutting)
- HADOOP-1726. Remove lib/jetty-ext/ant.jar. (omalley)
- HADOOP-1772. Fix hadoop-daemon.sh script to get correct hostname
- under Cygwin. (Tsz Wo (Nicholas), SZE via cutting)
- HADOOP-1749. Change TestDFSUpgrade to sort files, fixing sporadic
- test failures. (Enis Soztutar via cutting)
- HADOOP-1748. Fix tasktracker to be able to launch tasks when log
- directory is relative. (omalley via cutting)
- HADOOP-1775. Fix a NullPointerException and an
- IllegalArgumentException in MapWritable.
- (Jim Kellerman via cutting)
- HADOOP-1795. Fix so that jobs can generate output file names with
- special characters. (Frédéric Bertin via cutting)
- HADOOP-1810. Fix incorrect value type in MRBench (SmallJobs)
- (Devaraj Das via tomwhite)
- HADOOP-1806. Fix ant task to compile again, also fix default
- builds to compile ant tasks. (Chris Douglas via cutting)
- HADOOP-1758. Fix escape processing in librecordio to not be
- quadratic. (Vivek Ratan via cutting)
- HADOOP-1817. Fix MultiFileSplit to read and write the split
- length, so that it is not always zero in map tasks.
- (Thomas Friol via cutting)
- HADOOP-1853. Fix contrib/streaming to accept multiple -cacheFile
- options. (Prachi Gupta via cutting)
- HADOOP-1818. Fix MultiFileInputFormat so that it does not return
- empty splits when numPaths < numSplits. (Thomas Friol via enis)
- HADOOP-1840. Fix race condition which leads to task's diagnostic
- messages getting lost. (acmurthy)
- HADOOP-1885. Fix race condition in MiniDFSCluster shutdown.
- (Chris Douglas via nigel)
- HADOOP-1889. Fix path in EC2 scripts for building your own AMI.
- (tomwhite)
- HADOOP-1892. Fix a NullPointerException in the JobTracker when
- trying to fetch a task's diagnostic messages from the JobClient.
- (Amar Kamat via acmurthy)
- HADOOP-1897. Completely remove about.html page from the web site.
- (enis)
- HADOOP-1907. Fix null pointer exception when getting task diagnostics
- in JobClient. (Christian Kunz via omalley)
- HADOOP-1882. Remove spurious asterisks from decimal number displays.
- (Raghu Angadi via cutting)
- HADOOP-1783. Make S3 FileSystem return Paths fully-qualified with
- scheme and host. (tomwhite)
- HADOOP-1925. Make pipes' autoconf script look for libsocket and libnsl, so
- that it can compile under Solaris. (omalley)
- HADOOP-1940. TestDFSUpgradeFromImage must shut down its MiniDFSCluster.
- (Chris Douglas via nigel)
- HADOOP-1930. Place the blame for failed fetches on the right host. (Arun C.
- Murthy via omalley)
- HADOOP-1934. Fix the platform name on Mac to use underscores rather than
- spaces. (omalley)
- HADOOP-1959. Use "/" instead of File.separator in the StatusHttpServer.
- (jimk via omalley)
- HADOOP-1626. Improve dfsadmin help messages.
- (Lohit Vijayarenu via dhruba)
- HADOOP-1695. The SecondaryNamenode waits for the Primary NameNode to
- start up. (Dhruba Borthakur)
- HADOOP-1983. Have Pipes flush the command socket when progress is sent
- to prevent timeouts during long computations. (omalley)
- HADOOP-1875. Non-existent directories or read-only directories are
- filtered from dfs.client.buffer.dir. (Hairong Kuang via dhruba)
- HADOOP-1992. Fix the performance degradation in the sort validator.
- (acmurthy via omalley)
- HADOOP-1874. Move task-outputs' promotion/discard to a separate thread
- distinct from the main heartbeat-processing thread. The main upside being
- that we do not lock-up the JobTracker during HDFS operations, which
- otherwise may lead to lost tasktrackers if the NameNode is unresponsive.
- (Devaraj Das via acmurthy)
- HADOOP-2026. Namenode prints out one log line for "Number of transactions"
- at most once every minute. (Dhruba Borthakur)
- HADOOP-2022. Ensure that status information for successful tasks is correctly
- recorded at the JobTracker, so that, for example, one may view correct
- information via taskdetails.jsp. This bug was introduced by HADOOP-1874.
- (Amar Kamat via acmurthy)
-
- HADOOP-2031. Correctly maintain the taskid which takes the TIP to
- completion, failing which the case of lost tasktrackers isn't handled
- properly i.e. the map TIP is incorrectly left marked as 'complete' and it
- is never rescheduled elsewhere, leading to hung reduces.
- (Devaraj Das via acmurthy)
- HADOOP-2018. The source datanode of a data transfer waits for
- a response from the target datanode before closing the data stream.
- (Hairong Kuang via dhruba)
-
- HADOOP-2023. Disable TestLocalDirAllocator on Windows.
- (Hairong Kuang via nigel)
- HADOOP-2016. Ignore status-updates from FAILED/KILLED tasks at the
- TaskTracker. This fixes a race-condition which caused the tasks to wrongly
- remain in the RUNNING state even after being killed by the JobTracker and
- thus handicap the cleanup of the task's output sub-directory. (acmurthy)
- HADOOP-1771. Fix a NullPointerException in streaming caused by an
- IOException in MROutputThread. (lohit vijayarenu via nigel)
- HADOOP-2028. Fix distcp so that the log dir does not need to be
- specified and the destination does not need to exist.
- (Chris Douglas via nigel)
- HADOOP-2044. The namenode protects all lease manipulations using a
- sortedLease lock. (Dhruba Borthakur)
- HADOOP-2051. The TaskCommit thread should not die for exceptions other
- than the InterruptedException. This behavior is there for the other long
- running threads in the JobTracker. (Arun C Murthy via ddas)
- HADOOP-1973. The FileSystem object would be accessed on the JobTracker
- through a RPC in the InterTrackerProtocol. The check for the object being
- null was missing and hence NPE would be thrown sometimes. This issue fixes
- that problem. (Amareshwari Sri Ramadasu via ddas)
- HADOOP-2033. The SequenceFile.Writer.sync method was a no-op, which caused
- very uneven splits for applications like distcp that count on them.
- (omalley)
- HADOOP-2070. Added a flush method to pipes' DownwardProtocol and call
- that before waiting for the application to finish to ensure all buffered
- data is flushed. (Owen O'Malley via acmurthy)
- HADOOP-2080. Fixed calculation of the checksum file size when the values
- are large. (omalley)
- HADOOP-2048. Change error handling in distcp so that each map copies
- as much as possible before reporting the error. Also report progress on
- every copy. (Chris Douglas via omalley)
- HADOOP-2073. Change size of VERSION file after writing contents to it.
- (Konstantin Shvachko via dhruba)
-
- HADOOP-2102. Fix the deprecated ToolBase to pass its Configuration object
- to the superseding ToolRunner to ensure it picks up the appropriate
- configuration resources. (Dennis Kubes and Enis Soztutar via acmurthy)
-
- HADOOP-2103. Fix minor javadoc bugs introduced by HADOOP-2046. (Nigel
- Daley via acmurthy)
- IMPROVEMENTS
- HADOOP-1908. Restructure data node code so that block sending and
- receiving are separated from data transfer header handling.
- (Hairong Kuang via dhruba)
- HADOOP-1921. Save the configuration of completed/failed jobs and make them
- available via the web-ui. (Amar Kamat via devaraj)
- HADOOP-1266. Remove dependency of package org.apache.hadoop.net on
- org.apache.hadoop.dfs. (Hairong Kuang via dhruba)
- HADOOP-1779. Replace INodeDirectory.getINode() by a getExistingPathINodes()
- to allow the retrieval of all existing INodes along a given path in a
- single lookup. This facilitates removal of the 'parent' field in the
- inode. (Christophe Taton via dhruba)
- HADOOP-1756. Add toString() to some Writable-s. (ab)
- HADOOP-1727. New classes: MapWritable and SortedMapWritable.
- (Jim Kellerman via ab)
- HADOOP-1651. Improve progress reporting.
- (Devaraj Das via tomwhite)
- HADOOP-1595. dfsshell can wait for a file to achieve its intended
- replication target. (Tsz Wo (Nicholas), SZE via dhruba)
- HADOOP-1693. Remove un-needed log fields in DFS replication classes,
- since the log may be accessed statically. (Konstantin Shvachko via cutting)
- HADOOP-1231. Add generics to Mapper and Reducer interfaces.
- (tomwhite via cutting)
- HADOOP-1436. Improved command-line APIs, so that all tools need
- not subclass ToolBase, and generic parameter parser is public.
- (Enis Soztutar via cutting)
- HADOOP-1703. DFS-internal code cleanups, removing several uses of
- the obsolete UTF8. (Christophe Taton via cutting)
- HADOOP-1731. Add Hadoop's version to contrib jar file names.
- (cutting)
- HADOOP-1689. Make shell scripts more portable. All shell scripts
- now explicitly depend on bash, but do not require that bash be
- installed in a particular location, as long as it is on $PATH.
- (cutting)
- HADOOP-1744. Remove many uses of the deprecated UTF8 class from
- the HDFS namenode. (Christophe Taton via cutting)
- HADOOP-1654. Add IOUtils class, containing generic io-related
- utility methods. (Enis Soztutar via cutting)
- HADOOP-1158. Change JobTracker to record map-output transmission
- errors and use them to trigger speculative re-execution of tasks.
- (Arun C Murthy via cutting)
- HADOOP-1601. Change GenericWritable to use ReflectionUtils for
- instance creation, avoiding classloader issues, and to implement
- Configurable. (Enis Soztutar via cutting)
- HADOOP-1750. Log standard output and standard error when forking
- task processes. (omalley via cutting)
- HADOOP-1803. Generalize build.xml to make files in all
- src/contrib/*/bin directories executable. (stack via cutting)
- HADOOP-1739. Let OS always choose the tasktracker's umbilical
- port. Also switch default address for umbilical connections to
- loopback. (cutting)
- HADOOP-1812. Let OS choose ports for IPC and RPC unit tests. (cutting)
- HADOOP-1825. Create $HADOOP_PID_DIR when it does not exist.
- (Michael Bieniosek via cutting)
- HADOOP-1425. Replace uses of ToolBase with the Tool interface.
- (Enis Soztutar via cutting)
- HADOOP-1569. Reimplement DistCP to use the standard FileSystem/URI
- code in Hadoop so that you can copy from and to all of the supported file
- systems. (Chris Douglas via omalley)
- HADOOP-1018. Improve documentation w.r.t handling of lost heartbeats between
- TaskTrackers and JobTracker. (acmurthy)
- HADOOP-1718. Add ant targets for measuring code coverage with clover.
- (simonwillnauer via nigel)
- HADOOP-1592. Log error messages to the client console when tasks
- fail. (Amar Kamat via cutting)
- HADOOP-1879. Remove some unneeded casts. (Nilay Vaish via cutting)
- HADOOP-1878. Add space between priority links on job details
- page. (Thomas Friol via cutting)
- HADOOP-120. In ArrayWritable, prevent creation with null value
- class, and improve documentation. (Cameron Pope via cutting)
- HADOOP-1926. Add a random text writer example/benchmark so that we can
- benchmark compression codecs on random data. (acmurthy via omalley)
- HADOOP-1906. Warn the user if they have an obsolete mapred-default.xml
- file in their configuration directory. (acmurthy via omalley)
- HADOOP-1971. Warn when job does not specify a jar. (enis via cutting)
- HADOOP-1942. Increase the concurrency of transaction logging to
- edits log. Reduce the number of syncs by double-buffering the changes
- to the transaction log. (Dhruba Borthakur)
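
Double-buffering as described in HADOOP-1942 lets writers keep appending while an earlier batch is being persisted, so that one sync covers many records. The following is a minimal Python sketch of that pattern, not the namenode's edit-log code; the class name `DoubleBufferedLog` and the `sink` callable (standing in for a write followed by a sync to disk) are assumptions.

```python
import threading


class DoubleBufferedLog:
    """Batch log records using two buffers that swap roles on sync."""

    def __init__(self, sink):
        self._sink = sink            # persists one batch of records
        self._current = []           # receives new records
        self._ready = []             # batch currently being flushed
        self._swap_lock = threading.Lock()
        self._flush_lock = threading.Lock()
        self.sync_count = 0

    def append(self, record):
        # Held only briefly, so appends are not blocked by a slow flush.
        with self._swap_lock:
            self._current.append(record)

    def sync(self):
        with self._flush_lock:       # one flusher at a time
            with self._swap_lock:
                if not self._current:
                    return
                # Swap buffers; writers continue into the empty one.
                self._current, self._ready = self._ready, self._current
            self._sink(list(self._ready))
            self._ready.clear()
            self.sync_count += 1


if __name__ == "__main__":
    persisted = []
    log = DoubleBufferedLog(persisted.extend)
    for i in range(5):
        log.append(i)
    log.sync()
    print(persisted, log.sync_count)
```

Five appends followed by one `sync()` result in a single call to the sink, which is the reduction in syncs the entry refers to.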
- HADOOP-2046. Improve mapred javadoc. (Arun C. Murthy via cutting)
- HADOOP-2105. Improve overview.html to clarify supported platforms,
- software pre-requisites for hadoop, how to install them on various
- platforms and a better general description of hadoop and its utility.
- (Jim Kellerman via acmurthy)
- Release 0.14.4 - 2007-11-26
- BUG FIXES
- HADOOP-2140. Add missing Apache Licensing text at the front of several
- C and C++ files.
- HADOOP-2169. Fix the DT_SONAME field of libhdfs.so to set it to the
- correct value of 'libhdfs.so', currently it is set to the absolute path of
- libhdfs.so. (acmurthy)
- HADOOP-2001. Make the job priority updates and job kills synchronized on
- the JobTracker. Deadlock was seen in the JobTracker because of the lack of
- this synchronization. (Arun C Murthy via ddas)
- Release 0.14.3 - 2007-10-19
- BUG FIXES
- HADOOP-2053. Fixed a dangling reference to a memory buffer in the map
- output sorter. (acmurthy via omalley)
- HADOOP-2036. Fix a NullPointerException in JvmMetrics class. (nigel)
- HADOOP-2043. Release 0.14.2 was compiled with Java 1.6 rather than
- Java 1.5. (cutting)
- Release 0.14.2 - 2007-10-09
- BUG FIXES
- HADOOP-1948. Removed spurious error message during block crc upgrade.
- (Raghu Angadi via dhruba)
- HADOOP-1862. reduces are getting stuck trying to find map outputs.
- (Arun C. Murthy via ddas)
-
- HADOOP-1977. Fixed handling of ToolBase cli options in JobClient.
- (enis via omalley)
- HADOOP-1972. Fix LzoCompressor to ensure the user has actually asked
- to finish compression. (arun via omalley)
- HADOOP-1970. Fix deadlock in progress reporting in the task. (Vivek
- Ratan via omalley)
- HADOOP-1978. Name-node removes edits.new after a successful startup.
- (Konstantin Shvachko via dhruba)
- HADOOP-1955. The Namenode tries to not pick the same source Datanode for
- a replication request if the earlier replication request for the same
- block and that source Datanode had failed.
- (Raghu Angadi via dhruba)
- HADOOP-1961. The -get option to dfs-shell works when a single filename
- is specified. (Raghu Angadi via dhruba)
- HADOOP-1997. TestCheckpoint closes the edits file after writing to it,
- otherwise the rename of this file on Windows fails.
- (Konstantin Shvachko via dhruba)
- Release 0.14.1 - 2007-09-04
- BUG FIXES
- HADOOP-1740. Fix null pointer exception in sorting map outputs. (Devaraj
- Das via omalley)
- HADOOP-1790. Fix tasktracker to work correctly on multi-homed
- boxes. (Torsten Curdt via cutting)
- HADOOP-1798. Fix jobtracker to correctly account for failed
- tasks. (omalley via cutting)
- Release 0.14.0 - 2007-08-17
- INCOMPATIBLE CHANGES
- 1. HADOOP-1134.
- CONFIG/API - dfs.block.size must now be a multiple of
- io.bytes.per.checksum, otherwise new files cannot be written.
- LAYOUT - DFS layout version changed from -6 to -7, which will require an
- upgrade from previous versions.
- PROTOCOL - Datanode RPC protocol version changed from 7 to 8.
- 2. HADOOP-1283
- API - deprecated file locking API.
- 3. HADOOP-894
- PROTOCOL - changed ClientProtocol to fetch parts of block locations.
- 4. HADOOP-1336
- CONFIG - Enable speculative execution by default.
- 5. HADOOP-1197
- API - deprecated method for Configuration.getObject, because
- Configurations should only contain strings.
- 6. HADOOP-1343
- API - deprecate Configuration.set(String,Object) so that only strings are
- put in Configurations.
- 7. HADOOP-1207
- CLI - Fix FsShell 'rm' command to continue when a non-existent file is
- encountered.
- 8. HADOOP-1473
- CLI/API - Job, TIP, and Task id formats have changed and are now unique
- across job tracker restarts.
- 9. HADOOP-1400
- API - JobClient constructor now takes a JobConf object instead of a
- Configuration object.
- NEW FEATURES and BUG FIXES
- 1. HADOOP-1197. In Configuration, deprecate getObject() and add
- getRaw(), which skips variable expansion. (omalley via cutting)
- 2. HADOOP-1343. In Configuration, deprecate set(String,Object) and
- implement Iterable. (omalley via cutting)
- 3. HADOOP-1344. Add RunningJob#getJobName(). (Michael Bieniosek via cutting)
- 4. HADOOP-1342. In aggregators, permit one to limit the number of
- unique values per key. (Runping Qi via cutting)
- 5. HADOOP-1340. Set the replication factor of the MD5 file in the filecache
- to be the same as the replication factor of the original file.
- (Dhruba Borthakur via tomwhite)
- 6. HADOOP-1355. Fix null pointer dereference in
- TaskLogAppender.append(LoggingEvent). (Arun C Murthy via tomwhite)
- 7. HADOOP-1357. Fix CopyFiles to correctly avoid removing "/".
- (Arun C Murthy via cutting)
- 8. HADOOP-234. Add pipes facility, which permits writing MapReduce
- programs in C++.
- 9. HADOOP-1359. Fix a potential NullPointerException in HDFS.
- (Hairong Kuang via cutting)
- 10. HADOOP-1364. Fix inconsistent synchronization in SequenceFile.
- (omalley via cutting)
- 11. HADOOP-1379. Add findbugs target to build.xml.
- (Nigel Daley via cutting)
- 12. HADOOP-1364. Fix various inconsistent synchronization issues.
- (Devaraj Das via cutting)
- 13. HADOOP-1393. Remove a potential unexpected negative number from
- uses of random number generator. (omalley via cutting)
- 14. HADOOP-1387. A number of "performance" code-cleanups suggested
- by findbugs. (Arun C Murthy via cutting)
- 15. HADOOP-1401. Add contrib/hbase javadoc to tree. (stack via cutting)
- 16. HADOOP-894. Change HDFS so that the client only retrieves a limited
- number of block locations per request from the namenode.
- (Konstantin Shvachko via cutting)
- 17. HADOOP-1406. Plug a leak in MapReduce's use of metrics.
- (David Bowen via cutting)
- 18. HADOOP-1394. Implement "performance" code-cleanups in HDFS
- suggested by findbugs. (Raghu Angadi via cutting)
- 19. HADOOP-1413. Add example program that uses Knuth's dancing links
- algorithm to solve pentomino problems. (omalley via cutting)
- 20. HADOOP-1226. Change HDFS so that paths it returns are always
- fully qualified. (Dhruba Borthakur via cutting)
- 21. HADOOP-800. Improvements to HDFS web-based file browser.
- (Enis Soztutar via cutting)
- 22. HADOOP-1408. Fix a compiler warning by adding a class to replace
- a generic. (omalley via cutting)
- 23. HADOOP-1376. Modify RandomWriter example so that it can generate
- data for the Terasort benchmark. (Devaraj Das via cutting)
- 24. HADOOP-1429. Stop logging exceptions during normal IPC server
- shutdown. (stack via cutting)
- 25. HADOOP-1461. Fix the synchronization of the task tracker to
- avoid lockups in job cleanup. (Arun C Murthy via omalley)
- 26. HADOOP-1446. Update the TaskTracker metrics while the task is
- running. (Devaraj via omalley)
- 27. HADOOP-1414. Fix a number of issues identified by FindBugs as
- "Bad Practice". (Dhruba Borthakur via cutting)
- 28. HADOOP-1392. Fix "correctness" bugs identified by FindBugs in
- fs and dfs packages. (Raghu Angadi via cutting)
- 29. HADOOP-1412. Fix "dodgy" bugs identified by FindBugs in fs and
- io packages. (Hairong Kuang via cutting)
- 30. HADOOP-1261. Remove redundant events from HDFS namenode's edit
- log when a datanode restarts. (Raghu Angadi via cutting)
- 31. HADOOP-1336. Re-enable speculative execution by
- default. (omalley via cutting)
- 32. HADOOP-1311. Fix a bug in BytesWritable#set() where start offset
- was ignored. (Dhruba Borthakur via cutting)
- 33. HADOOP-1450. Move checksumming closer to user code, so that
- checksums are created before data is stored in large buffers and
- verified after data is read from large buffers, to better catch
- memory errors. (cutting)
- 34. HADOOP-1447. Add support in contrib/data_join for text inputs.
- (Senthil Subramanian via cutting)
- 35. HADOOP-1456. Fix TestDecommission assertion failure by setting
- the namenode to ignore the load on datanodes while allocating
- replicas. (Dhruba Borthakur via tomwhite)
- 36. HADOOP-1396. Fix FileNotFoundException on DFS block.
- (Dhruba Borthakur via tomwhite)
- 37. HADOOP-1467. Remove redundant counters from WordCount example.
- (Owen O'Malley via tomwhite)
- 38. HADOOP-1139. Log HDFS block transitions at INFO level, to better
- enable diagnosis of problems. (Dhruba Borthakur via cutting)
- 39. HADOOP-1269. Finer grained locking in HDFS namenode.
- (Dhruba Borthakur via cutting)
- 40. HADOOP-1438. Improve HDFS documentation, correcting typos and
- making images appear in PDF. Also update copyright date for all
- docs. (Luke Nezda via cutting)
- 41. HADOOP-1457. Add counters for monitoring task assignments.
- (Arun C Murthy via tomwhite)
- 42. HADOOP-1472. Fix so that timed-out tasks are counted as failures
- rather than as killed. (Arun C Murthy via cutting)
- 43. HADOOP-1234. Fix a race condition in file cache that caused
- tasktracker to not be able to find cached files.
- (Arun C Murthy via cutting)
- 44. HADOOP-1482. Fix secondary namenode to roll info port.
- (Dhruba Borthakur via cutting)
- 45. HADOOP-1300. Improve removal of excess block replicas to be
- rack-aware. Attempts are now made to keep replicas on more
- racks. (Hairong Kuang via cutting)
- 46. HADOOP-1417. Disable a few FindBugs checks that generate a lot
- of spurious warnings. (Nigel Daley via cutting)
- 47. HADOOP-1320. Rewrite RandomWriter example to bypass reduce.
- (Arun C Murthy via cutting)
- 48. HADOOP-1449. Add some examples to contrib/data_join.
- (Senthil Subramanian via cutting)
- 49. HADOOP-1459. Fix so that, in HDFS, getFileCacheHints() returns
- hostnames instead of IP addresses. (Dhruba Borthakur via cutting)
- 50. HADOOP-1493. Permit specification of "java.library.path" system
- property in "mapred.child.java.opts" configuration property.
- (Enis Soztutar via cutting)
- 51. HADOOP-1372. Use LocalDirAllocator for HDFS temporary block
- files, so that disk space, writability, etc. is considered.
- (Dhruba Borthakur via cutting)
- 52. HADOOP-1193. Pool allocation of compression codecs. This
- eliminates a memory leak that could cause OutOfMemoryError,
- and also substantially improves performance.
- (Arun C Murthy via cutting)
- 53. HADOOP-1492. Fix a NullPointerException handling version
- mismatch during datanode registration.
- (Konstantin Shvachko via cutting)
- 54. HADOOP-1442. Fix handling of zero-length input splits.
- (Senthil Subramanian via cutting)
- 55. HADOOP-1444. Fix HDFS block id generation to check pending
- blocks for duplicates. (Dhruba Borthakur via cutting)
- 56. HADOOP-1207. Fix FsShell's 'rm' command to not stop when one of
- the named files does not exist. (Tsz Wo Sze via cutting)
- 57. HADOOP-1475. Clear tasktracker's file cache before it
- re-initializes, to avoid confusion. (omalley via cutting)
- 58. HADOOP-1505. Remove spurious stacktrace in ZlibFactory
- introduced in HADOOP-1093. (Michael Stack via tomwhite)
- 59. HADOOP-1484. Permit one to kill jobs from the web ui. Note that
- this is disabled by default. One must set
- "webinterface.private.actions" to enable this.
- (Enis Soztutar via cutting)
- 60. HADOOP-1003. Remove flushing of namenode edit log from primary
- namenode lock, increasing namenode throughput.
- (Dhruba Borthakur via cutting)
- 61. HADOOP-1023. Add links to searchable mail archives.
- (tomwhite via cutting)
- 62. HADOOP-1504. Fix terminate-hadoop-cluster script in contrib/ec2
- to only terminate Hadoop instances, and not other instances
- started by the same user. (tomwhite via cutting)
- 63. HADOOP-1462. Improve task progress reporting. Progress reports
- are no longer blocking since i/o is performed in a separate
- thread. Reporting during sorting and other phases is also more
- consistent. (Vivek Ratan via cutting)
- 64. [ intentionally blank ]
- 65. HADOOP-1453. Remove some unneeded calls to FileSystem#exists()
- when opening files, reducing the namenode load somewhat.
- (Raghu Angadi via cutting)
- 66. HADOOP-1489. Fix text input truncation bug due to mark/reset.
- Add a unit test. (Bwolen Yang via cutting)
- 67. HADOOP-1455. Permit specification of arbitrary job options on
- pipes command line. (Devaraj Das via cutting)
- 68. HADOOP-1501. Better randomize sending of block reports to
- namenode, to reduce load spikes. (Dhruba Borthakur via cutting)
- 69. HADOOP-1147. Remove @author tags from Java source files.
- 70. HADOOP-1283. Convert most uses of UTF8 in the namenode to be
- String. (Konstantin Shvachko via cutting)
- 71. HADOOP-1511. Speed up hbase unit tests. (stack via cutting)
- 72. HADOOP-1517. Remove some synchronization in namenode to permit
- finer grained locking previously added. (Konstantin Shvachko via cutting)
- 73. HADOOP-1512. Fix failing TestTextInputFormat on Windows.
- (Senthil Subramanian via nigel)
- 74. HADOOP-1518. Add a session id to job metrics, for use by HOD.
- (David Bowen via cutting)
- 75. HADOOP-1292. Change 'bin/hadoop fs -get' to first copy files to
- a temporary name, then rename them to their final name, so that
- failures don't leave partial files. (Tsz Wo Sze via cutting)
- 76. HADOOP-1377. Add support for modification time to FileSystem and
- implement in HDFS and local implementations. Also, alter access
- to file properties to be through a new FileStatus interface.
- (Dhruba Borthakur via cutting)