Abstract:
India has a significant nationwide shortage of trained physicians, with a physician-to-population ratio of only 1:1674. Thus, identifying critical physicians is imperative, as the proper diffusion of medical information to these physicians is of utmost importance. To identify critical physicians, one first needs to understand the structure of the physicians' social network as well as the underlying dynamics of adoption. The main contribution of this thesis is to understand how the network structure and the underlying network dynamics affect the diffusion of innovation inside physician social networks and to develop novel strategies for identifying critical physicians to accelerate the diffusion process. In the first experiment, a binary approach and a weighted approach were proposed for creating physician social networks. These approaches relied upon the similarity between physician attributes to assign weights to relationships. Results indicated that the weighted approach was more accurate than the binary approach in predicting a physician's medicine adoption. While network structures play a pivotal role in modeling information diffusion, one also needs to understand the underlying network dynamics. In the second experiment, the effect of network dynamics on the diffusion of innovation was investigated. First, innovation diffusion was analyzed from a time-series perspective. Diffusion (Rogers and Bass models), statistical (seasonal autoregressive integrated moving average), and machine learning (linear regression and gradient boosting regression) models were developed for predicting growth in the number of adopters of pain medications. The best performance was obtained by the time-series models, followed by the machine learning and diffusion models. Second, the innovation diffusion process inside a physician social network was analyzed by predicting information cascades using multi-layer perceptron (MLP) and long short-term memory (LSTM) neural networks. A systematic evaluation of different graph embedding techniques and of the effect of embedding dimensions on the prediction of information cascades was also performed. Results indicated that embedding techniques that preserved both first-order and second-order proximity performed better in cascade prediction than those that preserved only one of the proximity measures. Furthermore, MLPs outperformed LSTMs in predicting information cascades. The third experiment investigated the effect of communication channels such as electronic word-of-mouth (eWOM) communications (e.g., tweets) on the innovation diffusion process. A sentiment analysis of tweets by followers and non-followers of two rare disease medication manufacturers and their respective medications was performed to understand their effect on the overall perception. Results indicated a counter-intuitive finding: there was no significant difference in the average sentiment of followers and non-followers regarding rare disease medications, indicating that followers do not necessarily hold positive sentiments.
Finding critical physicians may be framed as an influence maximization (IM) problem, which aims to select a subset of physicians from a social influence graph such that the diffusion of information is maximized. In the fourth experiment, a reinforcement learning (RL)-based framework for solving the IM problem was proposed. The RL framework consisted of an edge-based graph neural network (GNN) that generated node embeddings, which were then fed to a double deep Q-network (DDQN) to learn a Q-function for predicting the solution set. The edge-based GNN was an ensemble architecture comprising structure2Vec, multi-headed self-attention, and an edge-enhanced graph neural network (EGNN). The framework was trained on a social graph with 20% of its edges randomly removed and tested on the whole graph. Results revealed that the framework was able to generalize to an unknown graph and yielded an influence spread that differed by 8% from that of a heuristic IM algorithm run on the whole graph.
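For orientation, the IM objective described above can be sketched with a plain greedy baseline. This is not the thesis's RL framework or the heuristic it was compared against. It is a minimal illustration that assumes an independent cascade diffusion model, a hypothetical adjacency-list format (`node -> [(neighbor, probability)]`), and hypothetical function names.

```python
import random

def simulate_ic(graph, seeds, rng):
    """Run one independent-cascade simulation and return the activated nodes."""
    active = set(seeds)
    frontier = list(seeds)
    while frontier:
        next_frontier = []
        for u in frontier:
            for v, p in graph.get(u, []):
                # Each newly active node gets one chance to activate each neighbor.
                if v not in active and rng.random() < p:
                    active.add(v)
                    next_frontier.append(v)
        frontier = next_frontier
    return active

def expected_spread(graph, seeds, runs=200, seed=0):
    """Monte Carlo estimate of the expected number of activated nodes."""
    rng = random.Random(seed)
    return sum(len(simulate_ic(graph, seeds, rng)) for _ in range(runs)) / runs

def greedy_im(graph, nodes, k, runs=200):
    """Pick k seeds, each time adding the node with the largest estimated spread."""
    chosen = []
    for _ in range(k):
        candidates = [n for n in nodes if n not in chosen]
        best = max(candidates, key=lambda n: expected_spread(graph, chosen + [n], runs))
        chosen.append(best)
    return chosen

# Hypothetical toy graph: edge probabilities of 1.0 make the cascade deterministic.
toy_graph = {"a": [("b", 1.0), ("c", 1.0)], "b": [("d", 1.0)], "e": [("f", 1.0)]}
toy_nodes = ["a", "b", "c", "d", "e", "f"]
toy_seeds = greedy_im(toy_graph, toy_nodes, k=2, runs=5)
```

On the toy graph the greedy loop first selects `a`, which reaches four nodes, and then `e`, which adds the two nodes that `a` cannot reach.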
In the fifth experiment, the problem of online influence maximization (OIM), a variant of the IM problem, was investigated. The objective of OIM is to identify critical physicians when the influence probabilities in the social graph are unknown. A new explore-exploit ensemble approach based on the Exponential-weight algorithm for Exploration and Exploitation using Expert advice (EXP4) was proposed for solving the OIM problem. Results indicated that the ensemble approach outperformed other current OIM algorithms.
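The core of the standard EXP4 update can be sketched as follows. The abstract does not say how experts, arms, and rewards are mapped onto the OIM setting, so the function names, the fixed-advice experts, and the toy reward below are all hypothetical. The sketch shows only the generic algorithm: mix the experts' advice into one distribution, draw an arm, and reweight experts by an importance-weighted reward.

```python
import math
import random

def exp4_probabilities(weights, advice, gamma):
    """Mix expert advice (one distribution over arms per expert) with uniform exploration."""
    n_arms = len(advice[0])
    total = sum(weights)
    return [
        (1 - gamma) * sum(w * a[j] for w, a in zip(weights, advice)) / total
        + gamma / n_arms
        for j in range(n_arms)
    ]

def exp4_update(weights, advice, probs, arm, reward, gamma):
    """Reweight each expert by the importance-weighted reward it would have earned."""
    n_arms = len(probs)
    estimate = reward / probs[arm]  # unbiased estimate; zero for arms not played
    return [w * math.exp(gamma * a[arm] * estimate / n_arms)
            for w, a in zip(weights, advice)]

def run_exp4(advice, reward_fn, rounds, gamma=0.1, seed=0):
    """Hypothetical driver: experts give the same advice in every round."""
    rng = random.Random(seed)
    weights = [1.0] * len(advice)
    for _ in range(rounds):
        probs = exp4_probabilities(weights, advice, gamma)
        arm = rng.choices(range(len(probs)), weights=probs)[0]
        weights = exp4_update(weights, advice, probs, arm, reward_fn(arm), gamma)
    return weights

# Toy setting (assumption): expert 0 always recommends arm 0, which is the only rewarding arm.
toy_advice = [[1.0, 0.0], [0.0, 1.0]]
final_weights = run_exp4(toy_advice, lambda arm: 1.0 if arm == 0 else 0.0, rounds=200)
```

In this toy setting the weight of the expert that recommends the rewarding arm grows, while the other expert's weight stays at its initial value.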
Lastly, in the sixth experiment, the problem of volume maximization (VM) and different algorithmic solutions to it were investigated. The objective of the VM problem was to select a set of physicians who are both influential and interact frequently with patients. For solving the VM problem, two frameworks were proposed: a reinforcement learning framework that developed Q-learning and SARSA models, and an instance-based learning (IBL) framework that developed IBL models. A greedy-based algorithm called weighted-CELF was also proposed for the VM problem. The proposed models were compared with multiple IM algorithms such as PMIA. Results revealed that the weighted-greedy and weighted-CELF algorithms gave the best weighted influence spreads, while PMIA gave the best influence spread. In contrast, the RL models (Q-learning and SARSA) gave good weighted influence spread and influence spread across different initial seed set sizes (k). This thesis highlights the utility of the proposed approaches in identifying critical physicians and shows the effect of network structure and the underlying dynamics on the diffusion of innovation in physicians' social networks.