Heterogeneous manycores composed of CPUs, GPUs, and accelerators place stringent demands on networks-on-chip (NoCs). The NoC must carry the combined traffic, which includes both latency-sensitive CPU traffic and throughput-sensitive GPU and accelerator traffic. We study the characteristics of this combined traffic and observe that (1) limited injection bandwidth is the main obstacle to throughput improvement, and (2) latency due to local and global contention accounts for a significant portion of the network latency. We propose a router architecture for heterogeneous manycores that introduces two optimizations: (1) increasing injection bandwidth to improve throughput, and (2) resolving local and global contention to reduce network latency. Specifically, our design increases injection bandwidth through modifications to the injection link, the crossbar switch, and the buffer organization in the injection port of the router. It identifies upcoming local contention and resolves it by optimally selecting traffic routes. It also uses a supervised learning engine to detect global contention through traffic analysis and prediction, and alleviates that contention by adaptively adjusting traffic injection. Simulation results using the Rodinia benchmark show that, compared to the baseline router, the proposed router architecture provides a 28% throughput increase, a 24% latency reduction, a 22% execution time speedup, and a 19% energy efficiency improvement.
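The idea of predicting global contention and adaptively adjusting injection can be illustrated with a minimal sketch. This is not the paper's learning engine: the abstract does not specify the model, its features, or the throttling policy. The class `ContentionPredictor`, the function `throttled_rate`, the choice of an online linear model, the learning rate, and the 0.5 threshold are all hypothetical choices made only for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ContentionPredictor:
    """Hypothetical online linear model (an assumption, not the paper's engine).

    Maps traffic features (for example buffer occupancy or recent injection
    rate) to a contention score in [0, 1].
    """
    n_features: int
    lr: float = 0.1  # learning rate, assumed value
    weights: List[float] = field(default_factory=list)
    bias: float = 0.0

    def __post_init__(self) -> None:
        if not self.weights:
            self.weights = [0.0] * self.n_features

    def _raw(self, x: List[float]) -> float:
        # Unclamped linear score.
        return self.bias + sum(w * xi for w, xi in zip(self.weights, x))

    def predict(self, x: List[float]) -> float:
        # Clamp the linear score to the [0, 1] range.
        return min(1.0, max(0.0, self._raw(x)))

    def update(self, x: List[float], observed: float) -> None:
        # Supervised step: move the linear score toward the observed label
        # (a least-mean-squares update on the unclamped score).
        err = observed - self._raw(x)
        self.weights = [w + self.lr * err * xi for w, xi in zip(self.weights, x)]
        self.bias += self.lr * err


def throttled_rate(max_rate: float, predicted: float, threshold: float = 0.5) -> float:
    """Inject at full rate below the threshold; above it, scale the rate
    down linearly so that it reaches zero at a predicted score of 1.0."""
    if predicted <= threshold:
        return max_rate
    return max_rate * (1.0 - predicted) / (1.0 - threshold)


if __name__ == "__main__":
    predictor = ContentionPredictor(n_features=1)
    # Train on a single synthetic sample labelled as fully congested.
    for _ in range(50):
        predictor.update([1.0], observed=1.0)
    score = predictor.predict([1.0])
    print(f"predicted contention: {score:.3f}")
    print(f"allowed injection rate: {throttled_rate(1.0, score):.3f}")
```

In this sketch a node would query the predictor each interval and cap its injection rate accordingly. A hardware implementation would differ substantially, for instance in arithmetic precision and in how training labels are collected.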