Assessing Moral Development in AI-Based Introspective Assistants: Challenges and Strategies

Article Type: Research Article

Authors

1 Ph.D. Candidate in Islamic Teaching, Imam Khomeini Educational and Research Institute, Qom, Iran.

2 Professor, Faculty of Computer Engineering, Iran University of Science and Technology, Tehran, Iran.

3 Assistant Professor, Department of Ethics, Imam Khomeini Educational and Research Institute, Qom, Iran.

Abstract

The emergence of large language models (LLMs) has opened new horizons for intelligent "self-accounting" (mohâsebe-ye nafs) assistants; nevertheless, the abstract nature of moral concepts makes their machine assessment difficult. This study extracts, analyzes, and classifies the technical and conceptual challenges of assessing morality within such intelligent introspective assistants, and seeks to formulate theoretical strategies for managing them. The findings are presented within a four-part analytical framework: 1) challenges of translating concepts into computational structures, such as operationalizing intention and the lack of ethical ontologies; 2) data-centric challenges, such as reliance on digital traces, the scarcity of standard datasets, and cultural bias; 3) challenges of algorithmic logic, such as model instability, the black-box problem, and sensitivity to phrasing; 4) interactive and dynamic challenges, including flawed feedback loops and temporal misattribution error. The study concludes that overcoming these obstacles requires moving beyond purely statistical approaches. The proposed strategies include developing multidimensional assessment indices; designing hybrid architectures that integrate explainable AI, causal inference, and formal ontologies grounded in Islamic ethics; and shifting from passive observation to active, user-centered self-evaluation.

Article Title [English]

Assessing Moral Development in AI-Based Introspective Assistants: Challenges and Strategies

Authors [English]

  • Majid Gholami 1
  • Behrooz Minaei Bidgoli 2
  • Hadi Hosseinkhani 3
1 Ph.D. Candidate in Islamic Teaching, Imam Khomeini Educational and Research Institute, Qom, Iran.
2 Professor, Faculty of Computer Engineering, Iran University of Science and Technology, Tehran, Iran.
3 Assistant Professor, Department of Ethics, Imam Khomeini Educational and Research Institute, Qom, Iran.
Abstract [English]

The emergence of Large Language Models (LLMs) has opened new horizons for intelligent "introspective assistants" (Mohâsebe-ye Nafs), yet the abstract nature of moral concepts complicates their computational assessment. This research extracts, analyzes, and classifies the technical and conceptual challenges of evaluating morality within such assistants and formulates theoretical strategies to manage them. Findings are structured into a four-part framework: 1) Conceptual Translation Challenges, such as operationalizing intention and lacking ethical ontologies; 2) Data-Centric Challenges, including reliance on digital footprints, scarce standardized datasets, and cultural bias; 3) Algorithmic Logic Challenges, like model instability, the black box problem, and sensitivity to phrasing; 4) Interactive and Dynamism Challenges, encompassing flawed feedback loops and temporal misattribution error. The study concludes that overcoming these obstacles requires moving beyond purely statistical approaches. Proposed strategies involve developing multidimensional assessment indices, designing hybrid architectures that integrate explainable AI, causal inference, and formal ontologies based on Islamic ethics, and shifting from passive observation to active, user-centric self-evaluation.
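The "multidimensional assessment index" strategy and the "sensitivity to phrasing" challenge described above can be combined in a purely illustrative sketch. Everything here is hypothetical and not from the article: the dimension names, the `score_text` keyword heuristic (a stand-in for an LLM-based scorer), and the `max_spread` threshold are illustrative assumptions. The idea shown is only that scoring several paraphrases of one self-report and checking their spread lets the system flag unreliable verdicts instead of reporting a single brittle score:

```python
from dataclasses import dataclass
from statistics import mean, pstdev

# Hypothetical moral dimensions loosely inspired by the article's framework;
# a real system would derive these from a formal ethical ontology.
DIMENSIONS = ("intention", "honesty", "consistency")

@dataclass
class Assessment:
    scores: dict   # dimension -> aggregated score in [0, 1]
    stable: bool   # False when paraphrases disagree too much

def score_text(text: str) -> dict:
    """Stand-in for an LLM-based scorer: a trivial keyword heuristic."""
    lowered = text.lower()
    return {d: 1.0 if d in lowered else 0.0 for d in DIMENSIONS}

def assess(paraphrases: list, max_spread: float = 0.2) -> Assessment:
    """Aggregate per-dimension scores over paraphrases of one self-report.

    The spread across paraphrases operationalizes the phrasing-sensitivity
    challenge: a large spread marks the verdict as unreliable rather than
    letting one arbitrary wording decide the score.
    """
    per_dim = {d: [score_text(p)[d] for p in paraphrases] for d in DIMENSIONS}
    scores = {d: mean(vals) for d, vals in per_dim.items()}
    spread = max(pstdev(vals) for vals in per_dim.values())
    return Assessment(scores=scores, stable=spread <= max_spread)
```

Under this sketch, two paraphrases that trigger different dimensions yield `stable=False`, signaling that the assessment should be deferred to the user's own active self-evaluation rather than reported as a fact about them.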

Keywords [English]

  • Introspection (Mohâsebe-ye Nafs)
  • Moral Assessment
  • Large Language Models (LLMs)
  • Artificial Intelligence
  • Intelligent Introspective Assistants