Loosely coupled applications composed of a very large number of tasks (from tens of thousands to billions) are common in the high-throughput computing and many-task computing paradigms. To efficiently execute large-scale computations that exceed what a single type of computing resource can deliver within the expected time, we must effectively integrate resources from heterogeneous distributed computing (HDC) systems such as clusters, grids, and clouds. In this paper, we quantitatively analyze the performance of three real scientific applications, each consisting of many tasks, on HDC systems built from a partnership of distributed computing clusters, grids, and clouds. The analysis characterizes both the applications and the resources, and exposes practical issues that ordinary scientific users can face when leveraging these systems. Our experimental study shows that the performance of a loosely coupled application can be significantly affected by the characteristics of an HDC system as well as by the hardware specification of a node, and that the impact of these factors varies widely with the resource usage pattern of each application. Based on these findings, we devise a preference-based scheduling algorithm that reflects the characteristics and resource usage patterns of various loosely coupled applications running on HDC systems. The algorithm allocates resources from different HDC systems to loosely coupled applications according to each application's preferences for those systems. Using trace-based simulations, we evaluate the overall system performance over various preference types, which can be determined from different factors such as CPU specifications and application throughputs.
Our simulation results demonstrate the importance of understanding application and resource characteristics for the effective scheduling of loosely coupled applications on HDC systems.
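To make the idea of preference-based allocation concrete, the following is a minimal illustrative sketch and not the algorithm evaluated in this paper. It assumes that each application carries an ordered list of preferred HDC systems and a core demand, that each system exposes a count of free cores, and that applications are served in list order; the names `Application`, `allocate`, `demand`, and `capacity` are hypothetical and introduced only for this example.

```python
from dataclasses import dataclass


@dataclass
class Application:
    """Hypothetical description of a loosely coupled application."""
    name: str
    demand: int             # cores requested (assumed unit of allocation)
    preferences: list[str]  # HDC system names, most preferred first


def allocate(apps, capacity):
    """Greedy preference-based allocation (illustrative assumption only).

    apps:     list of Application, served in list order.
    capacity: dict mapping HDC system name -> free cores.
    Returns {application name: {system name: cores granted}}.
    """
    free = dict(capacity)  # copy so the caller's dict is left untouched
    plan = {}
    for app in apps:
        remaining = app.demand
        granted = {}
        # Walk the application's preference list from most to least preferred.
        for system in app.preferences:
            if remaining == 0:
                break
            take = min(remaining, free.get(system, 0))
            if take > 0:
                granted[system] = take
                free[system] -= take
                remaining -= take
        plan[app.name] = granted
    return plan


if __name__ == "__main__":
    capacity = {"cluster": 100, "grid": 300, "cloud": 200}
    apps = [
        Application("app_a", 150, ["cluster", "cloud", "grid"]),
        Application("app_b", 250, ["grid", "cluster"]),
    ]
    print(allocate(apps, capacity))
```

In this sketch the preference lists are given as input. In practice they could be derived from factors such as CPU specifications or measured application throughput, which is how the preference types studied here are determined.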