> messages into one network message. For applications cases, sometimes it
> helps with performance and sometimes it does not. OSU have shown both

When would a program deliberately send such messages? Isn't it something that the program should avoid in the first place? Does the MPI optimization apply to messages that differ in source/dest rank and/or tag?